Peripheral nervous system specific sodium channels, DNA encoding therefor, crystallization, X-ray diffraction, computer molecular modeling, rational drug design, drug screening, and methods of making and using thereof

ABSTRACT

PCT No. PCT/US95/14251 Sec. 371 Date May 2, 1997 Sec. 102(e) Date May 2, 1997 PCT Filed Nov. 2, 1995 PCT Pub. No. WO96/14077 PCT Pub. Date May 17, 1996

Cloning, expression, viral and delivery vectors and hosts which contain nucleic acid coding for at least one peripheral nervous system specific (PNS) sodium channel peptide (SCP), isolated PNS SCP, and compounds, compositions and methods are provided for isolating, crystallizing, x-ray analyzing, molecular modeling, rational drug designing, selecting, making and using therapeutic or diagnostic agents or ligands having at least one peripheral nervous system specific (PNS) sodium channel (SC) modulating activity.

STATEMENT AS TO RIGHTS TO INVENTIONS MADE UNDER FEDERALLY-SPONSORED RESEARCH AND DEVELOPMENT

The present invention was made with U.S. government support. Therefore, the U.S. government has certain rights in the invention.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation-in-part of U.S. application Ser. No. 08/482,401, filed Jun. 7, 1995, now abandoned, which is a continuation-in-part of U.S. application Ser. No. 08/334,029, filed Nov. 2, 1994, now abandoned, both of which disclosures are entirely incorporated herein by reference.

FIELD OF THE INVENTION

The present invention is in the fields of biotechnology, protein purification and crystallization, x-ray diffraction analysis, three-dimensional computer molecular modeling, and rational drug design (RDD). The invention is directed to isolated peripheral nervous system (PNS) specific sodium channel proteins (SCPs) and encoding nucleic acid, as well as to compounds, compositions and methods for selecting, making and using therapeutic or diagnostic agents having sodium channel modulating activity. The present invention further provides for three-dimensional computer modeling of the PNS SCP, and for RDD, based on the use of x-ray data and/or amino acid sequence data on computer readable media.

BACKGROUND OF THE INVENTION

Voltage-sensitive ion channels are a class of transmembrane proteins that provide a basis for cellular excitability, that is, the ability to transmit information via ion-generated membrane potentials. In response to changes in membrane potentials, these molecules mediate rapid ion flux through highly selective pores in a nerve cell membrane. If the channel density is high enough, a suitable regenerative depolarization results, termed the action potential.

The voltage-sensitive sodium channel is the ion channel most often responsible for generating the action potential in excitable cells. Although sodium-based action potentials in different excitable tissues look similar (Hille, B., In: Ionic Channels of Excitable Membranes, B. Hille, ed., Sinauer, Sunderland, Mass., (1984), pp. 70-71), recent electrophysiological studies indicate that sodium channels in different cells differ in both their structural and functional properties, and many sodium channels with distinct primary structures have now been identified. See, e.g., Mandel, J. Membrane Biol. 125:193-205 (1992).

Functionally distinct sodium channels have been described in a variety of neuronal cell types (Llinas et al., J. Physiol. 305:197-213 (1980); Kostyuk et al., Neuroscience 6:2423-2430 (1981); Bossu et al., Neurosci. Lett. 51:241-246 (1984); Gilly et al., Nature 309:448-450 (1984); French et al., Neurosci. Lett. 56:289-294 (1985); Ikeda et al., J. Neurophysiol. 55:527-539 (1986); Jones et al., J. Physiol. 389:605-627 (1987); Alonso & Llinas, 1989; Gilly et al., J. Neurosci. 9:1362-1374 (1989)) and in skeletal muscle (Gonoi et al., J. Neurosci. 5:2559-2564 (1985); Weiss et al., Science 233:361-364 (1986)). The kinetics of sodium currents in glia and neurons can also be distinguished (Barres et al., Neuron 2:1375-1388 (1989)).

The type II and type III genes, expressed widely in the central nervous system (CNS), are expressed at very low levels in some cells in the PNS (Beckh, S., FEBS Lett. 262:317-322 (1990)). The type II and III mRNAs were barely detectable, by Northern blot analysis, in dorsal root ganglion (DRG), cranial nerves and sciatic nerves. On the other hand, type I mRNA was present in moderately high amounts in DRG and cranial nerve, but at low levels in sciatic nerve. A comparison of the amount of all three brain mRNAs, relative to total sodium channel mRNA detected with a conserved cDNA probe, suggested the presence of additional, as yet unidentified, sodium channel types in DRG neurons. Consistent with the mRNA studies, immunochemical studies showed that neither type I nor type II sodium channel alpha subunits made up a significant component of the total sodium channels in the superior cervical ganglion or sciatic nerve (Gordon et al., Proc. Natl. Acad. Sci. USA 84:8682-8686 (1987)).

A population of neurons in vertebrate DRG has been identified electrophysiologically that contains, in addition to the more conventional channels, a distinct sodium channel type; this DRG channel has a K_D for tetrodotoxin (TTX) approximately tenfold higher than the K_D of sodium channels in either skeletal muscle or heart (Jones et al., J. Physiol. 389:605-627 (1987)).

The localization of different sodium channels to specific regions in the nervous system supports the possibility that cell-specific regulation of this gene family is at the transcriptional level. By analogy with other eukaryotic genes, distinct DNA elements can be present which mediate cell-specific and temporal regulation of individual sodium channel genes.

Studies of sodium channel gene regulation have been facilitated by the use of well-characterized cell lines, such as pheochromocytoma (PC12) cells, a popular cell model for neuronal differentiation (Green et al., Proc. Natl. Acad. Sci. USA 73:2424-2428 (1976); Halegoua et al., Curr. Top. Microbiol. Immunol. 165:119-170 (1991)). In addition to extending neurites and initiating synthesis of certain neurotransmitters, NGF-treated PC12 cells acquire the ability to generate sodium-based action potentials (Dichter et al., Nature 268:501-504 (1977)). This ability is conferred by an increase in the density of functional sodium channels in the membranes of the NGF-treated cells (Rudy et al., J. Neurosci. 7:1613-1625 (1987); Mandel et al., Proc. Natl. Acad. Sci. USA 85:924-928 (1988); O'Lague et al., Proc. Natl. Acad. Sci. USA 77:1701-1705 (1980)). Northern blot analysis revealed that undifferentiated PC12 cells contained a basal level of sodium channel mRNA which increased coincident with the increase in channel activity observed after treatment with NGF (Mandel et al., Proc. Natl. Acad. Sci. USA 85:924-928 (1988)).

There is a long standing need to diagnose and/or treat pathologies relating to impaired peripheral nervous system (PNS) nerve conduction associated with PNS injury or in genetic or other disease states, such as those involving lack of, or defects in, PNS sodium channels (SCs). In view of the possibility of cell or tissue specific sodium channels, the discovery and use of isolated PNS SCs and encoding nucleic acid would provide an opportunity to diagnose or treat such pathologies by either screening suitable PNS SC modulating drugs or molecules (e.g., analgesics), or by using recombinant PNS SCs for in situ or in vivo gene therapy to replace or supplement PNS SCs in at least one portion of the peripheral nervous system of a mammalian patient suffering from a PNS SC related pathology.

SUMMARY OF THE INVENTION

The present invention (hereinafter, "invention") provides peripheral nervous system specific (PNS) sodium channel peptides (SCPs), encoding nucleic acid, vectors, host cells and antibodies, as well as methods of making and using thereof, including recombinant expression, purification, cell-based drug screening, gene therapy, crystallization, X-ray diffraction analysis, as well as computer structure determination and rational drug design utilizing at least one PNS SCP amino acid sequence and/or x-ray diffraction data provided on computer readable media.

The invention also includes oligonucleotide probes specific for PNS SCP encoding sequences, as well as methods for detection in a sample, where the probe is labeled. The invention further includes methods for producing a PNS SCP, comprising culturing, in a culture medium, a host comprising a PNS SCP nucleic acid; and isolating the PNS SCP from said host or said culture medium.

The invention additionally includes an antibody which binds an epitope specific for a PNS SCP, as well as host cells which express the antibody. Diagnostic or therapeutic methods using the antibody are also included in the invention.

The invention further includes gene therapy methods and delivery vectors comprising nucleic acid encoding, or complementary to, at least one PNS SCP, and pharmaceutically acceptable compositions thereof.

The invention also includes gene therapy by methods that administer an antisense PNS SCP nucleic acid to an animal in an amount effective to provide a PNS SC modulating effect, such as an analgesic effect.

The present invention further provides methods for purifying and crystallizing a PNS SCP that can be analyzed to obtain x-ray diffraction patterns of sufficiently high resolution to be useful for three-dimensional molecular modeling of the protein. The x-ray diffraction data, atomic coordinates, and/or amino acid sequences provided on computer readable medium, are modeled on computer systems, using methods of the invention, to generate secondary, tertiary and/or quaternary structures of a PNS SCP, which structures contribute to their overall three dimensional structure, as well as binding and active sites of the PNS SCP.

Molecular modeling methods and computer systems are also provided by the present invention for rational drug design (RDD). These drug design methods use computer modeling programs to find potential ligands or agents that are calculated to bind with sites or domains on the PNS SCP. Potential ligands or agents are then screened for modulating or binding activity. Such screening methods can be selected from assays for at least one biological activity of the protein, as associated with a PNS SCP-related pathology or trauma, according to known sodium channel assays. The resulting ligands provided by methods of the present invention are synthesized and are useful for treating, inhibiting or preventing at least one PNS SCP-related pathology or trauma in a mammal.

Further objects, features, utilities, embodiments and/or advantages of the present invention will be apparent from the additional description provided herein.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1A-B depict a 323 amino acid and corresponding 969 nucleotide sequence of a PNS SCP as amino acids 233-555 of SEQ ID NO:2 and nucleotides 699-1665 of SEQ ID NO:1, as the primary structure of Domain III of the Peripheral Nerve type I (PN1) sodium channel alpha (α) subunit for both amino acid and DNA sequences. The single-letter amino acid code is used to denote deduced amino acids. YJ1 and YOIC refer to the oligonucleotide primers used to obtain the initial PCR fragment of PN1 cDNA.

FIGS. 2A-B show a Northern blot analysis of sodium channel α subunit mRNA in rat pheochromocytoma (PC12) cells treated with Nerve Growth Factor (NGF). In FIG. 2(A), the probe used is pRB211, which encodes the highly conserved fourth repeated domain of the rat type II sodium channel. Both type II and PN1 mRNAs are detected with this probe. In FIG. 2(B), the probe used contains sequences specific for PN1. The levels of sodium channel mRNA are quantitated with reference to the amount of cyclophilin mRNA, as indicated. Control cells are PC12 cells grown in the absence of NGF.

FIGS. 3A-B show an example of tissue-specific distribution of PN1 mRNA. FIG. 3(A) presents a Northern blot analysis using equal amounts of RNA from tissues. PN1 mRNA is indicated by the dash. 28S refers to the 28S rRNA. The probe contains sequences specific for the PN1 gene. Note the absence of PN1 mRNA in skeletal muscle and cardiac muscle, and the low levels of PN1 mRNA in spinal cord. FIG. 3(B) shows RNase protection analysis of PN1 mRNA. PN1 refers to the PN1 probe protected by mRNA from the different tissue samples. Actin refers to actin probe sequences protected by the same mRNA.

FIGS. 4A-F show localization of PN1 mRNA in Superior Cervical Ganglion (SCG) and Dorsal Root Ganglion (DRG) tissues by in situ hybridization analysis. FIGS. 4A-4B represent neurons hybridized with a PN1-specific antisense RNA probe. FIGS. 4C-4D represent neurons hybridized with the radiolabeled PN1 probe in the presence of non-labeled PN1 competitor DNA. FIGS. 4E-4F represent tissue sections hybridized with an antisense type II probe.

FIG. 5 shows a blot analysis comparing levels of PN1 and brain type I α subunit mRNA in SCG. The pRB11 conserved sodium channel probe detects both type II/IIA and PN1 transcripts.

FIGS. 6A-B show a Northern blot analysis which reveals differential expression of PN1 and type I sodium channel mRNAs during postnatal rat development. FIG. 6(A) shows a representative autoradiogram of a Northern blot using radiolabeled antisense pRB211 RNA as probe. Postnatal days 7 (P7) to 42 (P42) are shown. FIG. 6(B) shows a plot of quantitation of the Northern blots showing a decrease in type I mRNA with time after birth.

FIGS. 7A-D show the deduced primary structure of a cloned portion of PN1 α subunit cDNA as a partial 3033 nucleotide (SEQ ID NO:1) sequence and a partial 1011 amino acid (SEQ ID NO:2) sequence.

FIGS. 8A-D show a comparison of deduced primary amino acid sequences of PN1 (1-988 of SEQ ID NO:2) and brain type II/IIA α subunit (SEQ ID NO:7). A consensus sequence is also shown (SEQ ID NO:8).

FIGS. 9A-C show the entire DNA sequence for a rat PN1 PNS SCP (SEQ ID NO:9).

FIG. 10 shows the entire amino acid sequence for a rat PN1 PNS SCP (SEQ ID NO:10).

FIGS. 11A-F show amino acid sequences for rat PN1 ("RATPN1") (SEQ ID NO:10) and two expected human PN1 sequences, "HUMPN1A" (SEQ ID NO:11), "HUMPN1B" (SEQ ID NO:16), "HUMPN1C" (SEQ ID NO:15) and "HUMPN1D" (SEQ ID NO:12). Alternative sequences include those where "X" is 0, 1, 2, or 3 of the same or different amino acids, which can be optionally selected from Table 1 or Table 2.

FIG. 12 shows a computer system suitable for three dimensional structure determination and/or rational drug design.

FIGS. 13A-D show a representative DNA sequence encoding a human PN1 (HUM PN1A) (SEQ ID NO:13).

FIGS. 14A-D show a representative DNA sequence encoding a human PN1 (HUM PN1B) (SEQ ID NO:14).

DETAILED DESCRIPTION OF THE INVENTION

A need exists for modulating the activity of at least one peripheral nervous system specific (PNS) sodium channel (SC). Such modulation could potentially provide analgesic or diagnostic agents for pain or pathologies associated with nerve conduction in the PNS.

Certain sodium channels--corresponding to PNS SCPs of the invention--are now discovered to be preferentially or selectively expressed in the peripheral nervous system (PNS). These sodium channels modulate peripheral nerve impulse conduction preferentially in the PNS. The present invention provides peripheral nervous system specific (PNS) sodium channel peptides (SCPs), encoding nucleic acid, vectors, host cells and antibodies, as well as methods of making and using thereof, including recombinant expression, purification, cell-based drug screening, gene therapy, crystallization, X-ray diffraction analysis, as well as computer structure determination and rational drug design utilizing at least one PNS SCP amino acid sequence and/or x-ray diffraction data provided on computer readable media.

A PNS sodium channel peptide (PNS SCP) can refer to any subset of a PNS sodium channel (SC) having SC activity, as a fragment, consensus sequence or repeating unit. A PNS SCP of the invention can be prepared by:

(a) recombinant DNA methods;

(b) proteolytic digestion of the intact molecule or a fragment thereof;

(c) chemical peptide synthesis methods well-known in the art; and/or

(d) any other method capable of producing a PNS SCP having a conformation similar to an active portion of a PNS SCP and having SC activity.

The SC activity can be screened according to known screening assays for sodium channel activity, in vitro, in situ or in vivo. The minimum peptide sequence to have activity is based on the smallest unit containing or comprising a particular region, domain, consensus sequence, or repeating unit thereof, of at least one PNS SCP.

According to the invention, a PNS SCP includes an association of two or more polypeptide domains, such as transmembrane, pore lining domains, or fragments thereof, corresponding to a PNS SCP, such as 1-40 domains or any range or value therein. Transmembrane, cytoplasmic, pore lining or other domains of a PNS SCP of the invention may have at least 74% homology, such as 74-100% overall homology or identity, or any range or value therein, to one or more corresponding SC domains as described herein (e.g., as presented in FIGS. 1, 7, 8, 10 or 11). As would be understood by one of ordinary skill in the art, the above configuration of domains is provided as part of a PNS SCP of the invention, such that a functional PNS SCP, when expressed in a suitable cell, is capable of transporting sodium ions across a lipid bilayer, a cell membrane or a membrane model. In intact cells having sufficient sodium channels, the cell can be capable of generating some form of an action potential, such as in a cell expressing at least one PNS SCP of the present invention. Such transport, as measured by suitable SC activity assays, establishes SC activity of one or more PNS SCPs of the invention.

Accordingly, a PNS SCP of the invention alternatively includes peptides having a portion of a SC amino acid sequence which substantially corresponds to at least one 20 to 2005 amino acid fragment and/or consensus sequence of a PNS SCP or group of PNS SCPs, wherein the PNS SCP has homology or identity of at least 74-99%, such as 88-99% (or any range or value therein, e.g., 87-99, 88-99, 89-99, 90-99, 91-99, 92-99, 93-99, 94-99, 95-99, 96-99, 97-99, or 98-99%) homology to at least one sequence or consensus sequence of FIGS. 1, 7, 8, 10 or 11. In one aspect, such a PNS SCP can maintain SC biological activity. It is preferred that a PNS SCP of the invention is not naturally occurring or is naturally occurring but is in a purified or isolated form which does not occur in nature. Preferably, a PNS SCP of the invention substantially corresponds to a set of domains of PN1, having at least 10 contiguous amino acids of FIGS. 1, 7, 8, 10 and 11, or at least 74% homology thereto.

Alternatively or additionally, a PNS SCP of the invention may comprise at least one domain corresponding to known sodium channel domains, such as rat brain or spinal cord SC domains, such as transmembrane domains, pore lining domains, cytoplasmic domains or extracellular domains, e.g., 1-3 to 14-17 (IIs6), 18-23 to 210-214 (cytoplasmic), 229-236 to 254-258 (IIIs1), 268-272 to 293-297 (IIIs2), 300-304 to 321-325 (IIIs3), 326-330 to 347-351 (IIIs4), 368-374 to 389-393 (IIIs5), 474-478 to 560-504 (IIIs6), 553-559 to 577-583 (IVs1), 589-593 to 611-615 (IVs2), 619-623 to 642-646 (IVs3), 654-658 to 678-682 (IVs4), 690-694 to 711-715 (IVs5), 779-783 to 801-805 (IVs6), 348-352 to 368-372, 501-505 to 550-554, 233-555, 676-678 to 689-693, 554-557 to 941-945, or any range or value therein, corresponding to SEQ ID NO:2 as presented in FIGS. 7A-7D, or variants thereof having substitutions as presented in Table 1 or Table 2, having 74-100% overall homology or any range or value therein. At least one of such domains is present in the PNS SCPs presented in FIGS. 11A-F, or fragments thereof, as non-limiting examples. Alternative domains are also encoded by DNA which hybridizes under stringent conditions to at least 30 contiguous nucleotides of FIGS. 1, 7, 9, 13 or 14, or having codons substituted therefor which encode the same amino acid as a particular codon. Additionally, phosphorylation (e.g., PKA and PKC) domains, as would be recognized by those skilled in the art, are also considered when providing a PNS SCP or encoding nucleic acid according to the invention.

Percent homology or identity can be determined, for example, by comparing sequence information using the GAP computer program, version 6.0, available from the University of Wisconsin Genetics Computer Group (UWGCG). The GAP program utilizes the alignment method of Needleman and Wunsch (J. Mol. Biol. 48:443 (1970)), as revised by Smith and Waterman (Adv. Appl. Math. 2:482 (1981)). Briefly, the GAP program defines similarity as the number of aligned symbols (i.e., nucleotides or amino acids) which are similar, divided by the total number of symbols in the shorter of the two sequences. The preferred default parameters for the GAP program include: (1) a unitary comparison matrix (containing a value of 1 for identities and 0 for non-identities) and the weighted comparison matrix of Gribskov and Burgess, Nucl. Acids Res. 14:6745 (1986), as described by Schwartz and Dayhoff, eds., ATLAS OF PROTEIN SEQUENCE AND STRUCTURE, National Biomedical Research Foundation, pp. 353-358 (1979); (2) a penalty of 3.0 for each gap and an additional 0.10 penalty for each symbol in each gap; and (3) no penalty for end gaps. In a preferred embodiment, the peptide of the invention corresponds to a SC biologically active portion of SEQ ID NO:2, or variant thereof, e.g., as presented in FIGS. 11A-F.
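The similarity ratio described above (matching aligned symbols divided by the length of the shorter sequence) can be sketched as follows. This is not the GAP program and performs no alignment; it assumes the two sequences have already been aligned, with "-" marking gaps, and it counts only identities, as under the unitary comparison matrix. The function names and example strings are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Identical aligned symbols divided by the shorter ungapped length, as a percentage."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Count columns where both sequences carry the same non-gap symbol.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap)
    shorter = min(len(aligned_a.replace(gap, "")), len(aligned_b.replace(gap, "")))
    if shorter == 0:
        raise ValueError("sequences must contain at least one symbol")
    return 100.0 * matches / shorter


def meets_homology_threshold(aligned_a: str, aligned_b: str, threshold: float = 74.0) -> bool:
    """Compare against the 74% lower bound recited in the text (assumed default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical pre-aligned peptides: 5 identities, shorter sequence has 6 residues.
a = "MKT-LLV"
b = "MKTALIV"
print(round(percent_identity(a, b), 2))   # 83.33
print(meets_homology_threshold(a, b))     # True
```

Producing the alignment itself, with the gap penalties listed above, would require a Needleman-Wunsch style implementation or an existing alignment tool.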

Thus, one of ordinary skill in the art, given the teachings and guidance presented in the present specification, will know how to add, delete or substitute other amino acid residues in other positions of a SC to obtain a PNS SCP, including substituted, deletional or additional variants, e.g., with a substitution as presented in Tables 1 or 2 below.

A PNS SCP of the invention also includes a variant wherein at least one amino acid residue in the peptide has been conservatively replaced, added or deleted by at least one different amino acid. For a detailed description of protein chemistry and structure, see, e.g., Schulz, et al., Principles of Protein Structure, Springer-Verlag, New York, 1978, and Creighton, T. E., Proteins: Structure and Molecular Properties, W.H. Freeman & Co., San Francisco, 1983, which are hereby incorporated by reference. For a presentation of nucleotide sequence substitutions, such as codon preferences, see Ausubel et al., eds., Current Protocols in Molecular Biology, Greene Publishing Assoc., New York, N.Y. (1987, 1992, 1993, 1994, 1995) at §§ A.1.1-A.1.24, and Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Press, Cold Spring Harbor, N.Y. (1989), at Appendices C and D.

Conservative substitutions of a PNS SCP of the invention include those wherein at least one amino acid residue in the peptide has been conservatively replaced by at least one different amino acid. Such substitutions preferably are made in accordance with the following list as presented in Table 1, which substitutions can be determined by routine experimentation to provide modified structural and functional properties of a synthesized peptide molecule, while maintaining SC biological activity, as determined by known SC activity assays. In the context of the invention, the term PNS SCP or "substantially corresponding to" includes such substitutions.

TABLE 1
______________________________________
Original Residue    Exemplary Substitution
______________________________________
Ala                 Gly; Ser
Arg                 Lys
Asn                 Gln; His
Asp                 Glu
Cys                 Ser
Gln                 Asn
Glu                 Asp
Gly                 Ala; Pro
His                 Asn; Gln
Ile                 Leu; Val
Leu                 Ile; Val
Lys                 Arg; Gln; Glu
Met                 Leu; Tyr; Ile
Phe                 Met; Leu; Tyr
Ser                 Thr
Thr                 Ser
Trp                 Tyr
Tyr                 Trp; Phe
Val                 Ile; Leu
______________________________________
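As an illustration only, Table 1 can be transcribed into a lookup and used to flag which changes in a variant fall within the listed exemplary substitutions. The helper names and the four-residue example are hypothetical; the dictionary contents are copied from Table 1.

```python
# Table 1 as a lookup: original residue -> exemplary substitutions.
TABLE_1 = {
    "Ala": {"Gly", "Ser"}, "Arg": {"Lys"}, "Asn": {"Gln", "His"},
    "Asp": {"Glu"}, "Cys": {"Ser"}, "Gln": {"Asn"}, "Glu": {"Asp"},
    "Gly": {"Ala", "Pro"}, "His": {"Asn", "Gln"}, "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"}, "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Tyr", "Ile"}, "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr"}, "Thr": {"Ser"}, "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"}, "Val": {"Ile", "Leu"},
}


def is_table1_substitution(original: str, replacement: str) -> bool:
    """True if the replacement is listed for the original residue in Table 1."""
    return replacement in TABLE_1.get(original, set())


def classify_changes(reference: list[str], variant: list[str]) -> list[tuple[int, str, str, bool]]:
    """List (1-based position, original, replacement, listed-in-Table-1) for each change."""
    if len(reference) != len(variant):
        raise ValueError("this sketch handles substitutions only (equal lengths)")
    return [
        (pos, ref, var, is_table1_substitution(ref, var))
        for pos, (ref, var) in enumerate(zip(reference, variant), start=1)
        if ref != var
    ]


# Hypothetical four-residue example.
changes = classify_changes(["Met", "Lys", "Thr", "Leu"], ["Leu", "Lys", "Ser", "Trp"])
for change in changes:
    print(change)
```

A change that is not listed is not thereby excluded; as the text states, its effect would be evaluated by an SC activity assay.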

Alternatively, another group of substitutions of PNS SCPs of the invention are those in which at least one amino acid residue in the protein molecule has been removed and a different residue added in its place according to the following Table 2. The types of substitutions which can be made in the protein or peptide molecule of the invention can be based on analysis of the frequencies of amino acid changes between a homologous protein of different species, such as those presented in Table 1-2 of Schulz et al., infra. Based on such an analysis, alternative conservative substitutions are defined herein as exchanges within one of the following five groups:

TABLE 2

1. Small aliphatic, nonpolar or slightly polar residues: Ala, Ser, Thr (Pro, Gly);

2. Polar, negatively charged residues and their amides: Asp, Asn, Glu, Gln;

3. Polar, positively charged residues: His, Arg, Lys;

4. Large aliphatic, nonpolar residues: Met, Leu, Ile, Val (Cys); and

5. Large aromatic residues: Phe, Tyr, Trp.

Most deletions and additions, and substitutions according to the invention are those which do not produce radical changes in the characteristics of the protein or peptide molecule. "Characteristics" is defined in a non-inclusive manner to define both changes in secondary structure, e.g. α-helix or β-sheet, as well as changes in physiological activity, e.g. in receptor binding assays.

Accordingly, based on the above examples of specific substitutions, alternative substitutions can be made by routine experimentation, to provide alternative PNS SCPs of the invention, e.g., by making one or more conservative substitutions of SC fragments which provide SC activity. However, when the exact effect of the substitution, deletion, or addition is to be confirmed, one skilled in the art will appreciate that the effect of at least one substitution, addition or deletion will be evaluated by at least one sodium channel activity screening assay, such as, but not limited to, immunoassays or bioassays, to confirm biological activity, such as, but not limited to, sodium channel activity.

Amino acid sequence variants of a PNS SCP of the invention can also be prepared by mutations in the DNA. Such variants include, for example, deletions from, or additions or substitutions of, residues within the amino acid sequence. Any combination of deletion, addition, and substitution can also be made to arrive at the final construct, provided that the final construct possesses some SC activity. Preferably improved SC activity is found over that of the non-variant peptide. Obviously, mutations that will be made in the DNA encoding the variant must not place the sequence out of reading frame and preferably will not create complementary regions that could produce secondary mRNA structure (see, e.g., EP Patent Application Publication No. 75,444; Ausubel, infra; Sambrook, infra). At the genetic level, these variants ordinarily are prepared by site-directed mutagenesis of nucleotides in the DNA encoding a PNS SCP, thereby producing DNA encoding the variant, and thereafter expressing the DNA in recombinant cell culture. The variants typically exhibit the same qualitative biological activity as the naturally occurring SC (see, e.g., Ausubel, infra; Sambrook, infra).
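The reading-frame requirement noted above can be illustrated with a minimal check: an insertion or deletion keeps downstream codons in frame only if the net length change is a multiple of three. The helper names and the 12-nucleotide coding sequence are hypothetical, and the check does not detect other problems such as a newly created stop codon.

```python
def delete_bases(cds: str, start: int, length: int) -> str:
    """Return the coding sequence with `length` bases removed at 0-based `start`."""
    return cds[:start] + cds[start + length:]


def preserves_reading_frame(original_cds: str, variant_cds: str) -> bool:
    """True if the net length change is a multiple of three nucleotides."""
    return (len(variant_cds) - len(original_cds)) % 3 == 0


# Hypothetical coding sequence, for illustration only.
cds = "ATGAAAACCCTG"
in_frame = delete_bases(cds, 3, 3)      # removes one whole codon
out_of_frame = delete_bases(cds, 3, 2)  # removes two bases, shifting the frame
print(preserves_reading_frame(cds, in_frame))      # True
print(preserves_reading_frame(cds, out_of_frame))  # False
```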

Once a PNS sodium channel structure or characteristics have been determined, PNS SCPs can be recombinantly or synthetically produced, or optionally purified, to provide commercially useful amounts of PNS SCPs for use in diagnostic or research applications, according to known method steps (see, e.g., Ausubel, infra, and Sambrook, infra, which references are herein entirely incorporated by reference).

A variety of methodologies known in the art can be utilized to obtain an isolated PNS SCP of the invention. In one embodiment, the peptide is purified from tissues or cells which naturally produce the peptide. Alternatively, the above-described isolated nucleic acid fragments could be used to express the PNS SCP protein in any organism. The samples of the invention include cells, protein extracts or membrane extracts of cells, or biological fluids. The sample will vary based on the assay format, the detection method and the nature of the tissues, cells or extracts used as the sample.

The cells and/or tissue can include, e.g., normal or pathologic animal cells or tissues, such as the peripheral nervous system, and extracts or cell cultures thereof, provided in vivo, in situ or in vitro, as cultured, passaged, non-passaged, transformed, recombinant, or isolated cells and/or tissues.

Any higher eukaryotic organism can be used as a source of at least one PNS SCP of the invention, as long as the source organism naturally contains such a peptide. As used herein, "source organism" refers to the original organism from which the amino acid sequence of the peptide is derived, regardless of the organism the peptide is expressed in and/or ultimately isolated from. Preferred organisms as sources of at least one PNS SCP or encoding nucleic acid can be any vertebrate animal, such as mammals, birds, bony fish, electric eels, frogs and toads. Among mammals, the preferred sources are mammals of the Orders Primates (including humans, apes and monkeys), Artiodactyla (including horses, goats, cows, sheep, pigs), Rodentia (including mice, rats, rabbits, and hamsters), and Carnivora (including cats and dogs). The most preferred source organisms are humans.

One skilled in the art can readily follow known methods for isolating proteins in order to obtain the peptide free of natural contaminants. These include, but are not limited to: immunochromatography, size-exclusion chromatography, HPLC, ion-exchange chromatography, and immunoaffinity chromatography. See, e.g., Ausubel, infra; Sambrook, infra; Colligan, infra.

Isolated Nucleic Acid Molecules Coding for PNS SCP Peptides

In one embodiment, the present invention relates to an isolated nucleic acid molecule coding for a peptide having an amino acid sequence corresponding to novel PNS SCPs. In one preferred embodiment, the isolated nucleic acid molecule comprises a PNS SCP nucleotide sequence with greater than 70% overall identity or homology to at least a 60 nucleotide sequence present in SEQ ID NO:1 (preferably greater than 80%; more preferably greater than 90%, such as 70-99% or any range or value therein). In another preferred embodiment, the isolated nucleic acid molecule comprises a PNS SCP nucleotide sequence corresponding to FIGS. 1, 7 or 9, or encoding at least one domain of FIGS. 1, 7, 8, 10 and 11.

Also included within the scope of this invention are the functional equivalents of the herein-described isolated nucleic acid molecules and derivatives thereof. For example, as presented above for PNS SCP amino acid sequences, the nucleic acid sequences depicted in SEQ ID NO:1 can be altered by substitutions, additions or deletions that provide for functionally equivalent molecules. Due to the degeneracy of nucleotide coding sequences, other DNA sequences which encode substantially the same amino acid sequence of a PNS SCP can be used in the practice of the invention. These include but are not limited to nucleic acid sequences encoding all or portions of the PNS SCP amino acid sequence of FIGS. 1, 8, 10 and 11, which are altered by the substitution of different codons that encode a functionally equivalent amino acid residue within the sequence, thus producing a silent change.
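The degeneracy referred to above can be illustrated with a short Python sketch. It assumes the standard genetic code and in-frame DNA sequences; the helper names and the nine-nucleotide example sequences are hypothetical and are not taken from the figures.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_silent_variant(original: str, variant: str) -> bool:
    """True if the two DNA sequences differ but encode the same peptide."""
    return original.upper() != variant.upper() and translate(original) == translate(variant)


# Hypothetical example: both sequences encode Met-Leu-Ser.
original = "ATGCTGTCT"
variant = "ATGTTAAGC"
silent = is_silent_variant(original, variant)
```

Here the variant differs from the original at five nucleotide positions, yet both translate to the same three residues, which is the kind of silent change contemplated above.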

Such functional alterations of a given nucleic acid sequence afford an opportunity to promote secretion and/or processing of heterologous proteins encoded by foreign nucleic acid sequences fused thereto. All variations of the nucleotide sequence of the PNS SCP gene and fragments thereof permitted by the genetic code are, therefore, included in this invention. See, e.g., Ausubel, infra; Sambrook, infra.

In addition, the nucleic acid sequence can comprise a nucleotide sequence which results from the addition, deletion or substitution of at least one nucleotide to the 5'-end and/or the 3'-end of a nucleic acid sequence corresponding to FIGS. 1, 7 or 9, or encoding at least a portion of FIGS. 1, 8, 10 or 11, or a variant thereof. Any nucleotide or polynucleotide can be used in this regard, provided that its addition, deletion or substitution does not remove the sodium channel activity which is encoded by the nucleotide sequence. Moreover, the nucleic acid molecule of the invention can, as necessary, have restriction endonuclease recognition sites which do not remove the activity of the encoded PNS SCP.

Further, it is possible to delete codons or to substitute one or more codons by codons other than degenerate codons to produce a structurally modified peptide, but one which has substantially the same utility or activity of the peptide produced by the unmodified nucleic acid molecule. As recognized in the art, the two peptides are functionally equivalent, as are the two nucleic acid molecules which give rise to their production, even though the differences between the nucleic acid molecules are not related to degeneracy of the genetic code. See, e.g., Ausubel, infra; Sambrook, infra.

Isolation of Nucleic Acid. In another aspect of the present invention, isolated nucleic acid molecules coding for peptides having amino acid sequences corresponding to PNS SCP are provided. In particular, the nucleic acid molecule can be isolated from a biological sample containing mammalian nucleic acid, as corresponding to a probe specific for a PNS SC obtained from a higher eukaryotic organism.

The nucleic acid molecule can be isolated from a biological sample containing nucleic acid using known techniques, such as but not limited to, primer amplification or cDNA cloning.

The nucleic acid molecule can be isolated from a biological sample containing genomic DNA or from a genomic library. Suitable biological samples include, but are not limited to, normal or pathologic animal cells or tissues, such as cerebrospinal fluid (CNS), peripheral nervous system (neurons, ganglion) and portions, cells of heart, smooth, skeletal or cardiac muscle, autonomic nervous system, and extracts or cell cultures thereof, provided in vivo, in situ or in vitro, as cultured, passaged, non-passaged, transformed, recombinant, or isolated cells and/or tissues. The method of obtaining the biological sample will vary depending upon the nature of the sample.

One skilled in the art will realize that a mammalian genome can be subject to slight allelic variations between individuals. Therefore, the isolated nucleic acid molecule is also intended to include allelic variations, so long as the sequence encodes a PNS SCP. When a PNS SCP allele does not encode the identical amino acid sequence to that found in FIGS. 1, 8, 10 or 11, or at least one domain thereof, it can be isolated and identified as PNS SCP using the same techniques used herein, and especially nucleic acid amplification techniques to amplify the appropriate gene with primers based on the sequences disclosed herein. Such variations are presented, e.g., in FIG. 11 and in Tables 1 and 2.

The cloning of large cDNAs proceeds in the same way as the cloning of smaller cDNAs (e.g., PN1 as a PNS SCP of the invention includes overlapping clones of about 13 kDa), but takes more routine experimentation. One useful method relies on cDNA bacteriophage library screening (see, e.g., Sambrook, infra, or Ausubel, infra). Probes for the screening are labeled, e.g., with random hexamers and Klenow enzyme (Pharmacia kit). If 5' cDNAs are not obtained with these approaches, a cDNA sublibrary can be prepared in which specific PN1 primers are used to prime the reverse transcription reaction in place of oligo dT or random primers. The cDNA sublibrary is then cloned into standard vectors such as lambda zap and screened using conventional techniques. This strategy was used previously (Noda et al., Nature 320:188-192 (1986); Noda et al., Nature 322:826-828 (1986)) to clone the brain type I and II sodium channel cDNAs. The construction of a full-length cDNA is performed by subcloning overlapping fragments into an expression vector (either prokaryotic or eukaryotic). This task is more difficult with large cDNAs because of the paucity of unique restriction sites, but routine restriction, cloning or PCR is used to join the fragments.
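Before overlapping fragments are joined at the bench, their sequences can be checked and merged in silico. The following Python sketch is offered only as an illustration of that idea and is not a step recited in the procedure above; it assumes two error-free fragment sequences on the same strand, and the function names, the minimum overlap and the toy sequences are hypothetical.

```python
def overlap_length(left: str, right: str, min_overlap: int = 10) -> int:
    """Length of the longest suffix of `left` that equals a prefix of
    `right`, or 0 if no overlap of at least `min_overlap` exists."""
    if min_overlap < 1:
        raise ValueError("min_overlap must be at least 1")
    longest_possible = min(len(left), len(right))
    for k in range(longest_possible, min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            return k
    return 0


def join_fragments(left: str, right: str, min_overlap: int = 10) -> str:
    """Merge two overlapping fragments into one contiguous sequence."""
    k = overlap_length(left, right, min_overlap)
    if k == 0:
        raise ValueError("fragments do not overlap sufficiently")
    return left + right[k:]


# Toy fragments (hypothetical) sharing the seven-nucleotide stretch TTGACCA.
five_prime = "ACGTACGTTTGACCA"
three_prime = "TTGACCAGGGTACC"
joined = join_fragments(five_prime, three_prime, min_overlap=5)
```

Real clones may contain sequencing errors or repeats, in which case an exact suffix-prefix match of this kind would need to be replaced by an alignment that tolerates mismatches.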

Synthesis of Nucleic Acid. Isolated nucleic acid molecules of the present invention are also meant to include those chemically synthesized. For example, a nucleic acid molecule with the nucleotide sequence which codes for the expression product of a PNS SCP gene can be designed and, if necessary, divided into appropriate smaller fragments. Then an oligomer which corresponds to the nucleic acid molecule, or to each of the divided fragments, can be synthesized (e.g., of 10-6015 nucleotides or any range or value therein, such as 10-100 nucleotides). Such synthetic oligonucleotides can be prepared, for example, by known techniques (See, e.g., Ausubel, infra, or Sambrook, infra) or by using an automated DNA synthesizer.
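The division of a designed sequence into smaller fragments for synthesis can be sketched as follows. This Python example is illustrative only: the 100 nucleotide maximum follows the 10-100 nucleotide range mentioned above, while the 20 nucleotide overlap between neighboring fragments, the function names and the toy sequence are assumptions made for the sketch.

```python
def split_into_oligos(seq: str, max_len: int = 100, overlap: int = 20) -> list[str]:
    """Divide `seq` into fragments of at most `max_len` nucleotides, each
    sharing `overlap` nucleotides with the fragment before it."""
    if not seq:
        raise ValueError("sequence must be non-empty")
    if overlap < 0 or max_len <= overlap:
        raise ValueError("require 0 <= overlap < max_len")
    step = max_len - overlap
    oligos = []
    start = 0
    while True:
        oligos.append(seq[start:start + max_len])
        if start + max_len >= len(seq):
            break
        start += step
    return oligos


def reassemble(oligos: list[str], overlap: int = 20) -> str:
    """Rebuild the designed sequence from fragments made by split_into_oligos."""
    return oligos[0] + "".join(o[overlap:] for o in oligos[1:])


# Toy 250 nt design (hypothetical sequence).
design = ("ATGGCACTGA" * 25)[:250]
oligos = split_into_oligos(design, max_len=100, overlap=20)
lengths = [len(o) for o in oligos]
```

With these settings the 250 nucleotide design is covered by three fragments, and reassembling them by dropping each shared overlap recovers the original design.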

A labeled oligonucleotide probe can be derived synthetically or by cloning. If necessary, the 5'-ends of the oligomers can be phosphorylated using T4 polynucleotide kinase. Kinasing of single strands prior to annealing or for labeling can be achieved using an excess of the enzyme. If kinasing is for the labeling of probe, the ATP can contain high specific activity radioisotopes. Then, the DNA oligomer can be subjected to annealing and ligation with T4 ligase or the like.

A Nucleic Acid Probe for the Specific Detection of PNS SCP. In another embodiment, the present invention relates to a nucleic acid probe of 15-6000 nucleotides for the specific detection of the presence of PNS SCP in a sample comprising the above-described nucleic acid molecules or at least a fragment thereof which binds under stringent conditions to a nucleic acid encoding at least one PNS SCP.

The nucleic acid probe can be used to screen an appropriate chromosomal or cDNA library by known hybridization method steps to obtain a PNS SCP encoding nucleic acid molecule of the invention. A chromosomal DNA or cDNA library can be prepared from appropriate cells according to recognized methods in the art (See, e.g., Ausubel, infra; Sambrook, infra).

In the alternative, organic chemical synthesis is carried out in order to obtain nucleic acid probes having nucleotide sequences which correspond to suitable portions of the amino acid sequence of the PNS SCP. Thus, the synthesized nucleic acid probes can be used as primers in nucleic acid amplification method steps.
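When a probe or primer is designed from a portion of an amino acid sequence, one practical consideration is how many distinct nucleotide sequences could encode that portion. The Python sketch below counts this degeneracy and picks the least degenerate stretch. It assumes the standard genetic code; the function names and the ten-residue peptide are hypothetical and are not taken from the PNS SCP sequences of the figures.

```python
from itertools import product
from math import prod

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS_FOR: dict[str, list[str]] = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO):
    CODONS_FOR.setdefault(aa, []).append("".join(codon))


def degeneracy(peptide: str) -> int:
    """Number of distinct DNA sequences that encode `peptide`."""
    return prod(len(CODONS_FOR[aa]) for aa in peptide.upper())


def least_degenerate_window(protein: str, width: int = 6) -> tuple[int, int, str]:
    """Return (degeneracy, start index, peptide) for the `width`-residue
    stretch of `protein` encoded by the fewest DNA sequences."""
    if len(protein) < width:
        raise ValueError("protein is shorter than the window")
    windows = [
        (degeneracy(protein[i:i + width]), i, protein[i:i + width])
        for i in range(len(protein) - width + 1)
    ]
    return min(windows)


# Hypothetical peptide used only to exercise the functions.
best = least_degenerate_window("LLSRMWMWDF", width=4)
```

Stretches rich in methionine and tryptophan, each encoded by a single codon, give the lowest counts and therefore the simplest oligonucleotide mixtures.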

The invention can thus provide methods for amplification of DNA and/or RNA using heat stable, cross-linked nucleotide primers, which cross-linked primers of the invention are used to provide nucleic acid encoding PNS SCPs according to the invention.

Methods of amplification of RNA or DNA are well known in the art and can be used according to the invention without undue experimentation, based on the teaching and guidance presented herein. According to the invention, the use of nucleic acids encoding portions of PNS SCPs according to the invention, as amplification primers, allows for advantages over known amplification primers, due to the increase in sensitivity, selectivity and/or rate of amplification.

Known methods of DNA or RNA amplification include, but are not limited to, polymerase chain reaction (PCR) and related amplification processes (see, e.g., U.S. Pat. Nos. 4,683,195, 4,683,202, 4,800,159, 4,965,188, to Mullis et al.; 4,795,699 and 4,921,794 to Tabor et al.; 5,142,033 to Innis; 5,122,464 to Wilson et al.; 5,091,310 to Innis; 5,066,584 to Gyllensten et al.; 4,889,818 to Gelfand et al.; 4,994,370 to Silver et al.; 4,766,067 to Biswas; 4,656,134 to Ringold; 5,340,728 to Grosz et al.; 5,322,770 to Gelfand et al.; 5,338,671 to Scalice et al.; PCT WO 92/06200 to Cetus Corp.; PCT WO 94/14978 to Strack et al., which patent disclosures are entirely incorporated herein by reference) and RNA mediated amplification which uses antisense RNA to the target sequence as a template for double stranded DNA synthesis (U.S. Pat. No. 5,130,238 to Malek et al., with the tradename NASBA), the entire contents of which patents and references are herein entirely incorporated by reference. Reviews of the PCR are provided by Mullis (Cold Spring Harbor Symp. Quant. Biol. 51:263-273 (1986)); Saiki et al. (Bio/Technology 3:1008-1012 (1985)); and Mullis et al. (Meth. Enzymol. 155:335-350 (1987)). One skilled in the art can readily design such probes based on the sequence disclosed herein using methods such as computer alignment and sequence analysis known in the art. See, e.g., Ausubel, infra; Sambrook, infra.

The hybridization probes of the invention can be labeled by standard labeling techniques such as with a radiolabel, enzyme label, fluorescent label, biotin-avidin label, chemiluminescence, and any other known and suitable labels. After hybridization, the probes can be visualized using known methods. The nucleic acid probes of the invention include RNA, as well as DNA probes, such probes being generated using techniques known in the art (See, e.g., Ausubel, infra; Sambrook, infra). In one embodiment of the above described method, a nucleic acid probe is immobilized on a solid support. Examples of such solid supports include, but are not limited to, plastics such as polycarbonate, complex carbohydrates such as agarose and SEPHAROSE, and acrylic resins, such as polyacrylamide and latex beads. Techniques for coupling nucleic acid probes to such solid supports are well known in the art (See, e.g., Ausubel, infra; Sambrook, infra).

The test samples suitable for nucleic acid probing methods of the invention include, for example, cells or nucleic acid extracts of cells, or biological fluids. The sample used in the above-described methods will vary based on the assay format, the detection method and the nature of the tissues, cells or extracts to be assayed. Methods for preparing nucleic acid extracts of cells are well known in the art and can be readily adapted in order to obtain a sample which is compatible with the method utilized.

Methods for Detecting the Presence of PNS SCP Encoding Nucleic Acid in a Biological Sample. In another embodiment, the present invention relates to methods for detecting the presence of PNS SCP encoding nucleic acid in a sample. Such methods can comprise (a) contacting the sample with the above-described nucleic acid probe, under conditions such that hybridization occurs, and (b) detecting the presence of the labeled probe bound to the nucleic acid in the sample. One skilled in the art can select a suitable, labeled nucleic acid probe according to techniques known in the art as described above. Samples to be tested include, but are not limited to, RNA samples of mammalian tissue.

PNS SCP has been found to be expressed in peripheral nerve and dorsal root ganglion cells. Accordingly, PNS SCP probes can be used to detect the presence of RNA from PN cells in such a biological sample. Further, altered expression levels of PNS SCP RNA in an individual, as compared to normal levels, can indicate the presence of disease. The PNS SCP probes can further be used to assay cellular activity in general and specifically in peripheral nervous system tissue.

A Kit for Detecting the Presence of PNS SCP in a Sample. In another embodiment, the present invention relates to a kit for detecting the presence of PNS SCP in a sample comprising at least one container having disposed therein the above-described nucleic acid probe. In a preferred embodiment, the kit further comprises other containers comprising one or more of the following: wash reagents and reagents capable of detecting the presence of bound nucleic acid probe. Examples of detection reagents include, but are not limited to, radiolabeled probes, enzymatic labeled probes (horseradish peroxidase, alkaline phosphatase), and affinity labeled probes (biotin, avidin, or streptavidin) (See, e.g., Ausubel, infra; Sambrook, infra).

A compartmentalized kit includes any kit in which reagents are contained in separate containers. Such containers include small glass containers, plastic containers or strips of plastic or paper. Such containers allow the efficient transfer of reagents from one compartment to another compartment such that the samples and reagents are not cross-contaminated and the agents or solutions of each container can be added in a quantitative fashion from one compartment to another. Such containers will include a container which will accept the test sample, a container which contains the probe or primers used in the assay, containers which contain wash reagents (such as phosphate buffered saline, TRIS-buffers, and the like), and containers which contain the reagents used to detect the hybridized probe, bound antibody, amplified product, or the like.

One skilled in the art will readily recognize that the nucleic acid probes described in the invention can readily be incorporated into one of the established kit formats which are well known in the art.

DNA Constructs Comprising a PNS SCP Nucleic Acid Molecule and Hosts Containing These Constructs. A nucleic acid sequence encoding a PNS SCP of the invention can be recombined with vector DNA in accordance with conventional techniques, including blunt-ended or staggered-ended termini for ligation, restriction enzyme digestion to provide appropriate termini, filling in of cohesive ends as appropriate, alkaline phosphatase treatment to avoid undesirable joining, and ligation with appropriate ligases. Techniques for such manipulations are disclosed, e.g., by Ausubel et al., infra, and are well known in the art.

A nucleic acid molecule, such as DNA, is said to be "capable of expressing" a polypeptide if it contains nucleotide sequences which contain transcriptional and translational regulatory information and such sequences are "operably linked" to nucleotide sequences which encode the polypeptide. An operable linkage is a linkage in which the regulatory DNA sequences and the DNA sequence sought to be expressed are connected in such a way as to permit gene expression as PNS SCPs or Ab fragments in recoverable amounts. The precise nature of the regulatory regions needed for gene expression can vary from organism to organism, as is well known in the analogous art. See, e.g., Sambrook, infra and Ausubel infra.

The invention accordingly encompasses the expression of a PNS SCP, in either prokaryotic or eukaryotic cells, although eukaryotic expression is preferred.

Preferred hosts are bacterial or eukaryotic hosts including bacteria, yeast, insects, fungi, bird and mammalian cells either in vivo, or in situ, or host cells of mammalian, insect, bird or yeast origin. It is preferred that the mammalian cell or tissue is of human, primate, hamster, rabbit, rodent, cow, pig, sheep, horse, goat, dog or cat origin, but any other mammalian cell can be used.

Eukaryotic hosts can include yeast, insects, fungi, and mammalian cells either in vivo, or in tissue culture. Preferred eukaryotic hosts can also include, but are not limited to insect cells, mammalian cells either in vivo, or in tissue culture. Preferred mammalian cells include Xenopus oocytes, HeLa cells, cells of fibroblast origin such as VERO or CHO-K1, or cells of lymphoid origin and their derivatives.

Mammalian cells provide post-translational modifications to protein molecules including correct folding or glycosylation at correct sites. Mammalian cells which can be useful as hosts include cells of fibroblast origin such as, but not limited to, NIH 3T3, VERO or CHO, or cells of lymphoid origin, such as, but not limited to, the hybridoma SP2/O-Ag14 or the murine myeloma P3-X63Ag8, hamster cell lines (e.g., CHO-K1 and progenitors, e.g., CHO-DUXB11) and their derivatives. One preferred type of mammalian cells are cells which are intended to replace the function of the genetically deficient cells in vivo. Neuronally derived cells are preferred for gene therapy of disorders of the nervous system. For a mammalian cell host, many possible vector systems are available for the expression of at least one PNS SCP. A wide variety of transcriptional and translational regulatory sequences can be employed, depending upon the nature of the host. The transcriptional and translational regulatory signals can be derived from viral sources, such as, but not limited to, adenovirus, bovine papilloma virus, Simian virus, or the like, where the regulatory signals are associated with a particular gene which has a high level of expression. Alternatively, promoters from mammalian expression products, such as, but not limited to, actin, collagen, or myosin, can be employed for protein production. See Ausubel, infra; Sambrook, infra.

When live insects are to be used, silk moth caterpillars and baculoviral vectors are presently preferred hosts for large scale PNS SCP production according to the invention. Production of PNS SCPs in insects can be achieved, for example, by infecting the insect host with a baculovirus engineered to express at least one PNS SCP by methods known to those skilled in the related arts. See Ausubel et al, eds. Current Protocols in Molecular Biology, Wiley Interscience, §§16.8-16.11 (1987, 1992, 1993, 1994).

In a preferred embodiment, the introduced nucleotide sequence will be incorporated into a plasmid or viral vector capable of autonomous replication in the recipient host. Any of a wide variety of vectors can be employed for this purpose. See, e.g., Ausubel et al., infra, §§ 1.5, 1.10, 7.1, 7.3, 8.1, 9.6, 9.7, 13.4, 16.2, 16.6, and 16.8-16.11. Factors of importance in selecting a particular plasmid or viral vector include: the ease with which recipient cells that contain the vector can be recognized and selected from those recipient cells which do not contain the vector; the number of copies of the vector which are desired in a particular host; and whether it is desirable to be able to "shuttle" the vector between host cells of different species.

Different host cells have characteristic and specific mechanisms for the translational and post-translational processing and modification (e.g., glycosylation, cleavage) of proteins. Appropriate cell lines or host systems can be chosen to ensure the desired modification and processing of the foreign protein expressed. For example, expression in a bacterial system can be used to produce an unglycosylated core protein product. Expression in yeast will produce a glycosylated product. Expression in mammalian cells can be used to ensure "native" glycosylation of the heterologous PNS SCP protein. Furthermore, different vector/host expression systems can effect processing reactions such as proteolytic cleavages to different extents.

As discussed above, expression of PNS SCP in eukaryotic hosts requires the use of eukaryotic regulatory regions. Such regions will, in general, include a promoter region sufficient to direct the initiation of RNA synthesis. See, e.g., Ausubel, infra; Sambrook, infra.

Once the vector or nucleic acid molecule containing the construct(s) has been prepared for expression, the DNA construct(s) can be introduced into an appropriate host cell by any of a variety of suitable means, e.g., transformation, transfection, conjugation, protoplast fusion, electroporation, particle gun technology, calcium phosphate-precipitation, direct microinjection, and the like. After the introduction of the vector, recipient cells are grown in a selective medium, which selects for the growth of vector-containing cells. Expression of the cloned gene molecule(s) results in the production of at least one PNS SCP. This can take place in the transformed cells as such, or following the induction of these cells to differentiate (for example, by administration of bromodeoxyuracil to neuroblastoma cells or the like).

Isolation of PNS SCP. The PNS SCP proteins or fragments of this invention can be obtained by expression from recombinant DNA as described above. Alternatively, a PNS SCP can be purified from biological material. If so desired, the at least one expressed PNS SCP can be isolated and purified in accordance with conventional method steps, such as extraction, precipitation, chromatography, affinity chromatography, electrophoresis, or the like. For example, cells expressing at least one PNS SCP in suitable levels can be collected by centrifugation, or with suitable buffers, lysed, and the protein isolated by column chromatography, for example, on DEAE-cellulose, phosphocellulose, polyribocytidylic acid-agarose, hydroxyapatite or by electrophoresis or immunoprecipitation. Alternatively, PNS SCPs can be isolated by the use of specific antibodies, such as, but not limited to, a PNS SCP or SC antibody. Such antibodies can be obtained by known method steps (see, e.g., Colligan, infra; Ausubel, infra).

For purposes of the invention, one method of purification which is illustrative, without being limiting, consists of the following steps. A first step in the purification of a PNS SCP includes extraction of the PNS SCP fraction from a biological sample, such as peripheral nerve tissue or dorsal root ganglia (DRG), in buffers, with or without solubilizing agents such as urea, formic acid, detergent, or thiocyanate. A second step includes subjecting the solubilized material to ion-exchange chromatography on Mono-Q or Mono-S columns (Pharmacia LKB Biotechnology, Inc.; Piscataway, N.J.). Similarly, the solubilized material can be separated by any other process wherein molecules can be separated according to charge density, charge distribution and molecular size, for example. Elution of the PNS SCP from the ion-exchange resin is monitored by an immunoassay, such as M-IRMA, on each fraction. Immunoreactive peaks are then dialyzed, lyophilized, and subjected to molecular sieve, or gel, chromatography. A third step is this molecular sieve or gel chromatography, a type of partition chromatography in which separation is based on molecular size. Dextran, polyacrylamide, and agarose gels are commonly used for this type of separation. One useful gel for the invention is SEPHAROSE 12 (Pharmacia LKB Biotechnology, Inc.). However, other methods known to those of skill in the art can be used to effectively separate molecules based on size. A fourth step in a purification protocol for a PNS SCP can include analyzing the immunoreactive peaks by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE), a further gel chromatographic purification step, and staining, such as, for example, silver staining. A fifth step in a purification method can include subjecting the PNS SCP obtained after SDS-PAGE to affinity chromatography, or any other procedure based upon affinity between a substance to be isolated and a molecule to which it can specifically bind.
For further purification of a PNS SCP, affinity chromatography on SEPHAROSE conjugated to anti-PNS SCP mAbs (specific mAbs generated against substantially pure PNS SCP) can be used. Alternative methods, such as reverse-phase HPLC, or any other method characterized by rapid separation with good peak resolution, are useful.

It will be appreciated that other purification steps can be substituted for the preferred method described above. Those of skill in the art will be able to devise alternate purification schemes without undue experimentation.

An Antibody Having Binding Affinity to a PNS SCP Peptide and a Hybridoma Containing the Antibody. In another embodiment, the invention relates to an antibody having binding affinity specifically to a PNS SCP peptide as described above or fragment thereof. Those which bind selectively to PNS SCP would be chosen for use in methods which could include, but should not be limited to, the analysis of altered PNS SCP expression in tissue containing PNS SCP.

The PNS SCP proteins of the invention can be used in a variety of procedures and methods, such as for the generation of antibodies, for use in identifying pharmaceutical compositions, and for studying DNA/protein interaction.

The PNS SCP peptide of the invention can be used to produce antibodies or hybridomas. One skilled in the art will recognize that if an antibody is desired, such a peptide would be generated as described herein and used as an immunogen.

The antibodies of the invention include monoclonal and polyclonal antibodies, as well as fragments of these antibodies. The invention further includes single chain antibodies. Antibody fragments which contain the idiotype of the molecule can be generated by known techniques.

The term "antibody" is meant to include polyclonal antibodies, monoclonal antibodies (mAbs), chimeric antibodies, anti-idiotypic (anti-Id) antibodies to antibodies that can be labeled in soluble or bound form, as well as fragments thereof provided by any known technique, such as, but not limited to, enzymatic cleavage, peptide synthesis or recombinant techniques. Polyclonal antibodies are heterogeneous populations of antibody molecules derived from the sera of animals immunized with an antigen. A monoclonal antibody contains a substantially homogeneous population of antibodies specific to antigens, which population contains substantially similar epitope binding sites. MAbs can be obtained by methods known to those skilled in the art. See, e.g., Kohler and Milstein, Nature 256:495-497 (1975); U.S. Pat. No. 4,376,110; Ausubel et al., eds., CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, Greene Publishing Assoc. and Wiley Interscience, N.Y., (1987, 1992); and Harlow and Lane ANTIBODIES: A LABORATORY MANUAL Cold Spring Harbor Laboratory (1988); Colligan et al., eds., Current Protocols in Immunology, Greene Publishing Assoc. and Wiley Interscience, N.Y., (1992, 1993), the contents of which references are incorporated entirely herein by reference. Such antibodies can be of any immunoglobulin class including IgG, IgM, IgE, IgA, IgD and any subclass thereof. A hybridoma producing a mAb of the invention can be cultivated in vitro, in situ or in vivo. Production of high titers of mAbs in vivo or in situ makes this the presently preferred method of production.

Chimeric antibodies are molecules different portions of which are derived from different animal species, such as those having a variable region derived from a murine mAb and a human immunoglobulin constant region, which are primarily used to reduce immunogenicity in application and to increase yields in production, for example, where murine mAbs have higher yields from hybridomas but higher immunogenicity in humans, such that human/murine chimeric mAbs are used. Chimeric antibodies and methods for their production are known in the art (Cabilly et al., Proc. Natl. Acad. Sci. USA 81:3273-3277 (1984); Morrison et al., Proc. Natl. Acad. Sci. USA 81:6851-6855 (1984); Boulianne et al., Nature 312:643-646 (1984); Cabilly et al., European Patent Application 125023; Neuberger et al., Nature 314:268-270 (1985); Taniguchi et al., European Patent Application 171 496; Morrison et al., European Patent Application 173 494; Neuberger et al., PCT Application WO 86/01533; Kudo et al., European Patent Application 184 187; Morrison et al., European Patent Application 173 494; Sahagan et al., J. Immunol. 137:1066-1074 (1986); Robinson et al., International Patent Publication No. PCT/US 86/02269; Liu et al., Proc. Natl. Acad. Sci. USA 84:3439-3443 (1987); Sun et al., Proc. Natl. Acad. Sci. USA 84:214-218 (1987); Better et al., Science 240:1041-1043 (1988); and Harlow, infra). These references are entirely incorporated herein by reference.

An anti-idiotypic (anti-Id) antibody is an antibody which recognizes unique determinants generally associated with the antigen-binding site of an antibody. An anti-Id antibody can be prepared by immunizing an animal of the same species and genetic type (e.g., mouse strain) as the source of the mAb with the mAb to which an anti-Id is being prepared. The immunized animal will recognize and respond to the idiotypic determinants of the immunizing antibody by producing an antibody to these idiotypic determinants (the anti-Id antibody). See, for example, U.S. Pat. No. 4,699,880, which is herein entirely incorporated by reference.

The anti-Id antibody can also be used as an "immunogen" to induce an immune response in yet another animal, producing a so-called anti-anti-Id antibody. The anti-anti-Id can be epitopically identical to the original mAb which induced the anti-Id. Thus, by using antibodies to the idiotypic determinants of a mAb, it is possible to identify other clones expressing antibodies of identical specificity.

Accordingly, mAbs generated against a PNS SCP of the invention can be used to induce anti-Id antibodies in suitable animals, such as BALB/c mice. Spleen cells from such immunized mice are used to produce anti-Id hybridomas secreting anti-Id mAbs. Further, the anti-Id mAbs can be coupled to a carrier such as keyhole limpet hemocyanin (KLH) and used to immunize additional BALB/c mice. Sera from these mice will contain anti-anti-Id antibodies that have the binding properties of the original mAb specific for a PNS SCP specific epitope. The anti-Id mAbs thus have their own idiotypic epitopes, or "idiotopes", structurally similar to the epitope being evaluated.

The term "antibody" is also meant to include both intact molecules as well as fragments thereof, such as, for example, Fab and F(ab')₂, which are capable of binding antigen. Fab and F(ab')₂ fragments lack the Fc fragment of intact antibody, clear more rapidly from the circulation, and can have less non-specific tissue binding than an intact antibody (Wahl et al., J. Nucl. Med. 24:316-325 (1983)). It will be appreciated that Fab and F(ab')₂ and other fragments of the antibodies useful in the invention can be used for the detection and/or quantitation of a PNS SCP according to the methods disclosed herein for intact antibody molecules. Such fragments are typically produced by proteolytic cleavage, using enzymes such as papain (to produce Fab fragments) or pepsin (to produce F(ab')₂ fragments). An antibody is said to be "capable of binding" a molecule if it is capable of specifically reacting with the molecule to thereby bind the molecule to the antibody. The term "epitope" is meant to refer to that portion of any molecule capable of being bound by an antibody which can also be recognized by that antibody. Epitopes or "antigenic determinants" usually consist of chemically active surface groupings of molecules such as amino acids or sugar side chains and have specific three dimensional structural characteristics as well as specific charge characteristics.

An "antigen" is a molecule or a portion of a molecule capable of being bound by an antibody which is additionally capable of inducing an animal to produce antibody capable of binding to an epitope of that antigen. An antigen can have one, or more than one epitope. The specific reaction referred to above is meant to indicate that the antigen will react, in a highly selective manner, with its corresponding antibody and not with the multitude of other antibodies which can be evoked by other antigens.

Immunoassays. Antibodies of the invention, directed against a PNS SCP, can be used to detect or diagnose a PNS SC or a PNS SC-related pathology. Screening methods provided by the invention can include, e.g., immunoassays employing radioimmunoassay (RIA) or enzyme-linked immunosorbent assay (ELISA) methodologies, based on the production of specific antibodies (monoclonal or polyclonal) to a PNS SCP. For these assays, biological samples are obtained by nerve biopsy or other peripheral nervous system tissue sampling. For example, in one form of RIA, the substance under test is mixed with diluted antiserum in the presence of radiolabeled antigen. In this method, the concentration of the test substance will be inversely proportional to the amount of labeled antigen bound to the specific antibody and directly related to the amount of free labeled antigen. Other suitable screening methods will be readily apparent to those of skill in the art.

Furthermore, one skilled in the art can readily adapt currently available procedures, as well as the techniques, methods and kits disclosed above with regard to antibodies, to generate peptides capable of binding to a specific peptide sequence in order to generate rationally designed antipeptide peptides, for example see Hurby et al., "Application of Synthetic Peptides: Antisense Peptides", In: Synthetic Peptides, A User's Guide, W.H. Freeman, NY, pp. 289-307 (1992), and Kaspczak et al., Biochemistry 28:9230-8 (1989).

One embodiment for carrying out the diagnostic assay of the invention on a biological sample containing a PNS SCP, comprises:

(a) contacting a detectably labeled PNS SCP-specific antibody with a solid support to effect immobilization of said PNS SCP-specific antibody or a fragment thereof;

(b) contacting a sample suspected of containing a PNS SCP with said solid support;

(c) incubating said detectably labeled PNS SCP-specific antibody with said support for a time sufficient to allow the immobilized PNS SCP-specific antibody to bind to the PNS SCP;

(d) separating the solid phase support from the incubation mixture obtained in step (c); and

(e) detecting the bound label and thereby detecting and quantifying PNS SCP.

The specific concentrations of detectably labeled antibody and PNS SCP, the temperature and time of incubation, as well as other assay conditions can be varied, depending on various factors including the concentration of a PNS SCP in the sample, the nature of the sample, and the like. The binding activity of a given lot of anti-PNS SCP antibody can be determined according to well known methods. Those skilled in the art will be able to determine operative and optimal assay conditions for each determination by employing routine experimentation. Other such steps as washing, stirring, shaking, filtering and the like can be added to the assays as is customary or necessary for the particular situation.

Detection can be accomplished using any of a variety of assays. For example, by radioactively labeling the PNS SCP-specific antibodies or antibody fragments, it is possible to detect PNS SCP through the use of radioimmune assays. A good description of a radioimmune assay can be found in Colligan, infra, and Ausubel, infra, entirely incorporated by reference herein. Preferably, the detection of cells which express a PNS SCP can be accomplished by in vivo imaging techniques, in which the labeled antibodies (or fragments thereof) are provided to a subject, and the presence of the PNS SCP is detected without the prior removal of any tissue sample. Such in vivo detection procedures have the advantage of being less invasive than other detection methods, and are, moreover, capable of detecting the presence of PNS SCP in tissue which cannot be easily removed from the patient, such as brain tissue.

There are many different in vivo labels and methods of labeling known to those of ordinary skill in the art. Examples of the types of labels which can be used in the invention include radioactive isotopes and paramagnetic isotopes. Those of ordinary skill in the art will know of other suitable labels for binding to the antibodies used in the invention, or will be able to ascertain such, using routine experimentation. Furthermore, the binding of these labels to the antibodies can be done using standard techniques common to those of ordinary skill in the art.

For diagnostic in vivo imaging, the type of detection instrument available is a major factor in selecting a given radionuclide. The radionuclide chosen must have a type of decay which is detectable for a given type of instrument. In general, any conventional method for visualizing diagnostic imaging can be utilized in accordance with this invention. For example, positron emission tomography (PET), gamma, beta, and magnetic resonance imaging (MRI) detectors can be used to visualize diagnostic imaging.

The antibodies useful in the invention can also be labeled with paramagnetic isotopes for purposes of in vivo diagnosis. Elements which are particularly useful, as in Magnetic Resonance Imaging (MRI), include ¹⁵⁷Gd, ⁵⁵Mn, ¹⁶²Dy, and ⁵⁶Fe.

The antibodies (or fragments thereof) useful in the invention are also particularly suited for use in in vitro immunoassays to detect the presence of a PNS SCP in body tissue, fluids (such as CSF), or cellular extracts. In such immunoassays, the antibodies (or antibody fragments) can be utilized in liquid phase or, preferably, bound to a solid-phase carrier, as described above.

In situ detection can be accomplished by removing a histological specimen from a patient, and providing the combination of labeled antibodies of the invention to such a specimen. The antibody (or fragment) is preferably provided by applying or by overlaying the labeled antibody (or fragment) to a biological sample. Through the use of such a procedure, it is possible to determine not only the presence of a PNS SCP, but also the distribution of a PNS SCP on the examined tissue. Using the invention, those of ordinary skill will readily perceive that any of a wide variety of histological methods (such as staining procedures) can be modified in order to achieve such in situ detection.

As used herein, an effective amount of a diagnostic reagent (such as an antibody or antibody fragment) is one capable of achieving the desired diagnostic discrimination and will vary depending on such factors as age, condition, sex, the extent of disease of the subject, counter-indications, if any, and other variables to be adjusted by the physician. The amount of such materials typically used in a diagnostic test is generally between 0.1 and 5 mg, and preferably between 0.1 and 0.5 mg.

The assay of the invention is also ideally suited for the preparation of a kit. Such a kit can comprise a carrier means being compartmentalized to receive in close confinement therewith one or more container means such as vials, tubes and the like, each of said container means comprising the separate elements of the immunoassay.

For example, there can be a container means containing a first antibody immobilized on a solid phase support, and a further container means containing a second detectably labeled antibody in solution. Further container means can contain standard solutions comprising serial dilutions of the PNS SCP to be detected. The standard solutions of a PNS SCP can be used to prepare a standard curve with the concentration of PNS SCP plotted on the abscissa and the detection signal on the ordinate. The results obtained from a sample containing a PNS SCP can be interpolated from such a plot to give the concentration of the PNS SCP.
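The interpolation step described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name, the piecewise-linear interpolation, and the dilution and signal values are hypothetical placeholders and are not data or methods taken from the invention; a real assay might instead fit a nonlinear model to the standards.

```python
def interpolate_concentration(standards, sample_signal):
    """Estimate a concentration from a standard curve by linear interpolation.

    standards: list of (concentration, signal) pairs from the serial dilutions.
    Assumes (hypothetically) that signal rises monotonically with concentration.
    """
    if len(standards) < 2:
        raise ValueError("at least two standards are needed")
    pts = sorted(standards, key=lambda p: p[1])  # order by detection signal
    if not pts[0][1] <= sample_signal <= pts[-1][1]:
        raise ValueError("sample signal lies outside the standard curve")
    # Find the pair of neighbouring standards that brackets the sample signal.
    for (c_lo, s_lo), (c_hi, s_hi) in zip(pts, pts[1:]):
        if s_lo <= sample_signal <= s_hi:
            if s_hi == s_lo:
                return c_lo
            frac = (sample_signal - s_lo) / (s_hi - s_lo)
            return c_lo + frac * (c_hi - c_lo)


# Hypothetical standards: (concentration, detection signal), arbitrary units.
standards = [(0.0, 0.05), (1.0, 0.25), (2.0, 0.45), (4.0, 0.85)]
estimate = interpolate_concentration(standards, 0.35)
```

Signals outside the range of the standards are rejected rather than extrapolated, since the plot described above supports interpolation only.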

Diagnostic Screening and Treatment. It is to be understood that although the following discussion is specifically directed to human patients, the teachings are also applicable to any animal that expresses at least one PNS SC. The diagnostic and screening methods of the invention are especially useful for a patient suspected of being at risk for developing a disease associated with an altered expression level of PNS SCP based on family history, or a patient in which it is desired to diagnose a PNS SCP-related disease.

According to the invention, presymptomatic screening of an individual in need of such screening is now possible using DNA encoding the PNS SCP protein of the invention. The screening method of the invention allows a presymptomatic diagnosis, including prenatal diagnosis, of the presence of a missing or aberrant PNS SC gene in individuals, and thus an opinion concerning the likelihood that such individual would develop or has developed a PNS SC-associated disease. This is especially valuable for the identification of carriers of altered or missing PNS SC genes, for example, from individuals with a family history of a PNS SC-related pathology. Early diagnosis is also desired to maximize appropriate timely intervention.

In one preferred embodiment of the method of screening, a tissue sample would be taken from such individual, and screened for (1) the presence of the "normal" PNS SCP gene; (2) the presence of PNS SCP mRNA and/or (3) the presence of PNS SCP protein. The normal human gene can be characterized based upon, for example, detection of restriction digestion patterns in "normal" versus the patient's DNA, including RFLP analysis, using DNA probes prepared against the PNS SCP sequence (or a functional fragment thereof) taught in the invention. Similarly, PNS SCP mRNA can be characterized and compared to normal PNS SCP mRNA (a) levels and/or (b) size as found in a human population not at risk of developing PNS SCP-associated disease using similar probes. Lastly, PNS SCP protein can be (a) detected and/or (b) quantitated using a biological assay for PNS SCP activity or using an immunological assay and PNS SCP antibodies. When assaying PNS SCP protein, the immunological assay is preferred for its speed. An (1) aberrant PNS SCP DNA size pattern, and/or (2) aberrant PNS SCP mRNA sizes or levels and/or (3) aberrant PNS SCP protein levels would indicate that the patient is at risk for developing a PNS SCP-associated disease.

The screening and diagnostic methods of the invention do not require that the entire PNS SCP DNA coding sequence be used for the probe. Rather, it is only necessary to use a fragment or length of nucleic acid that is sufficient to detect the presence of the PNS SCP gene in a DNA preparation from a normal or affected individual, the absence of such gene, or an altered physical property of such gene (such as a change in electrophoretic migration pattern).

Prenatal diagnosis can be performed when desired, using any known method to obtain fetal cells, including amniocentesis, chorionic villus sampling (CVS), and fetoscopy. Prenatal chromosome analysis can be used to determine if the portion of the chromosome possessing the normal PNS SCP gene is present in a heterozygous state.

Overview of PNS SCP Purification and Crystallization Methods. In general, a PNS SCP, as a membrane protein, is purified in soluble form using detergents (e.g., octylglucosides) or other suitable amphiphilic molecules. The resulting PNS SCP is in sufficient purity and concentration for crystallization. The purified PNS SCP is then isolated and assayed for biological activity and for lack of aggregation (which interferes with crystallization). The purified and cleaved PNS SCP preferably runs as a single band under reducing or nonreducing polyacrylamide gel electrophoresis (PAGE) (nonreducing is used to evaluate the presence of cysteine bridges). The purified PNS SCP is preferably crystallized under varying conditions of at least one of the following: pH, buffer type, buffer concentration, salt type, polymer type, polymer concentration, other precipitating ligands and concentration of purified and cleaved PNS SCP by known methods. See, e.g., Michel, Trends in Biochem. Sci. 8:56-59 (1983); Deisenhofer et al., J. Mol. Biol. 180:385-398 (1984); Weiss et al., FEBS Lett. 267:268-272 (1990); Blundell et al., Protein Crystallography, Academic Press, London (1976); Oxender et al., eds., Protein Engineering, Liss, New York (1986); McPherson, The Preparation and Analysis of Protein Crystals, Wiley, N.Y. (1982); or the methods provided in a commercial kit, such as CRYSTAL SCREEN (Hampton Research, Riverside, Calif.). The crystallized protein is also tested for at least one SC activity and differently sized and shaped crystals are further tested for suitability in X-ray diffraction. Generally, larger crystals provide better crystallography than smaller crystals, and thicker crystals provide better crystallography than thinner crystals. See, e.g., Blundell, infra; Oxender, infra; McPherson, infra; Wyckoff et al., eds., Diffraction Methods for Biological Macromolecules, Vols. 114-115, Methods in Enzymology, Orlando, Fla., Academic Press (1985).

Protein Crystallization Methods. The hanging drop method is preferably used to crystallize a purified, soluble PNS SCP protein. See, e.g., Taylor et al., J. Mol. Biol. 226:1287-1290 (1992); Takimoto et al. (1992), infra; CRYSTAL SCREEN, Hampton Research. A mixture of the protein and precipitant can include the following:

• pH (e.g., 4-10);

• buffer type (e.g., tromethamine (TRIZMA), sodium azide, phosphate, sodium, or cacodylate acetates, imidazole, Tris HCl, sodium hepes);

• buffer concentration (e.g., 0.1-100 mM);

• salt type (e.g., sodium azide, calcium chloride, sodium citrate, magnesium chloride, ammonium acetate, ammonium sulfate, potassium phosphate, magnesium acetate, zinc acetate, calcium acetate);

• polymer type and concentration (e.g., polyethylene glycol (PEG) 1-50%, type 6000-10,000);

• other precipitating ligands (salts: potassium, sodium, tartrate, ammonium sulfate, sodium acetate, lithium sulfate, sodium formate, sodium citrate, magnesium formate, sodium phosphate, potassium phosphate; organics: 2-propanol; non-volatile: 2-methyl-2,4-pentanediol); and

• concentration of purified PNS SCP (e.g., 0.1-100 mg/ml, with added amphiphilic molecules (detergents such as octylglucosides)).

See, e.g., CRYSTAL SCREEN, Hampton Research.

The above mixtures are used and screened by varying at least one of pH, buffer type, buffer concentration, precipitating salt type or concentration, PEG type, PEG concentration, and cleaved protein concentration. Crystals ranging in size from 0.1-1.5 mm are formed in 1-14 days. These crystals diffract X-rays to at least 10 Å resolution, such as 1.5-10.0 Å, or any range or value therein, such as 1.5, 1.6, 1.7, 1.8, 1.9, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 3.0, 3.1, 3.2, 3.3, 3.4, 3.5 or 3, with 3.5 Å or less being preferred for the highest resolution. In addition to diffraction patterns having this highest resolution, lower resolution, such as 25-3.5 Å, can further be used.
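Varying several of these factors at once amounts to enumerating a grid of conditions. The following Python sketch shows one way such a screen could be laid out; the function name, the factor names, and the specific levels are illustrative assumptions chosen from within the ranges recited above, not a recommended or tested screen.

```python
from itertools import product


def build_screen(factors):
    """Return every combination of the supplied factor levels as a list of dicts."""
    names = list(factors)
    combos = product(*(factors[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


# Hypothetical levels, picked only to illustrate the enumeration.
factors = {
    "pH": [5.0, 7.0, 9.0],
    "buffer": ["Tris HCl", "imidazole"],
    "salt": ["ammonium sulfate", "sodium citrate"],
    "peg_percent": [5, 15, 30],
}
screen = build_screen(factors)  # one dict per hanging-drop condition
```

Each entry of `screen` describes one drop; a full factorial grid grows quickly, which is one reason sparse commercial screens are commonly used as a starting point.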

Protein Crystals. Crystals appear after 1-14 days and continue to grow on subsequent days. Some of the crystals are removed, washed, and assayed for biological activity, which activity is preferred for using in further characterizations. Other washed crystals are preferably run on a stained gel and those that migrate in the same position as the purified cleaved PNS SCP are preferably used. From two to one hundred crystals are observed in one drop and crystal forms can occur, such as, but not limited to, bipyramidal, rhomboid, and cubic. Initial X-ray analyses are expected to indicate that such crystals diffract at moderately high to high resolution. When fewer crystals are produced in a drop, they can be of much larger size, e.g., 0.2-1.5 mm.

PNS SCP X-ray Crystallography Methods. The crystals so produced for a PNS SCP are X-ray analyzed using a suitable X-ray source. A suitable number of diffraction patterns are obtained. Crystals are preferably stable for at least 10 hrs in the X-ray beam. Frozen crystals (e.g., -220 to -50° C.) are optionally used for longer X-ray exposures (e.g., 4-72 hrs), the crystals being relatively more stable to the X-rays in the frozen state. To collect the maximum number of useful reflections, multiple frames are optionally collected as the crystal is rotated in the X-ray beam, e.g., for 12-96 hrs. Larger crystals (>0.2 mm) are preferred, to increase the resolution of the X-ray diffraction. Crystals are preferably analyzed using a synchrotron high energy X-ray source. Using frozen crystals, X-ray diffraction data is collected on crystals that diffract to a resolution of 10-1.5 Å, with lower resolutions also useful, such as 25-10 Å, sufficient to solve the three-dimensional structure of a PNS SCP in considerable detail, as presented herein.

Computer Related Embodiments. An amino acid sequence of a PNS SCP and/or x-ray diffraction data, useful for computer molecular modeling of a PNS SCP or a portion thereof, can be "provided" in a variety of mediums to facilitate use thereof. As used herein, "provided" refers to a manufacture, which contains a PNS SCP amino acid sequence and/or x-ray diffraction data of the present invention, e.g., the amino acid sequence provided in FIGS. 1, 8, 10 or 11, a representative fragment thereof, or an amino acid sequence having at least 80-100% overall identity to a 5-2005 amino acid fragment of an amino acid sequence of FIGS. 11A-F or a variant thereof. Such a method provides the amino acid sequence and/or x-ray diffraction data in a form which allows a skilled artisan to analyze and molecularly model the three-dimensional structure of a PNS SCP or subdomain thereof.

In one application of this embodiment, PNS SCP, or at least one subdomain thereof, amino acid sequence and/or x-ray diffraction data of the present invention is recorded on computer readable medium. As used herein, "computer readable medium" refers to any medium which can be read and accessed directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as optical discs or CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled artisan can readily appreciate how any of the presently known computer readable mediums can be used to create a manufacture comprising computer readable medium having recorded thereon an amino acid sequence and/or x-ray diffraction data of the present invention.

As used herein, "recorded" refers to a process for storing information on computer readable medium. A skilled artisan can readily adopt any of the presently known methods for recording information on computer readable medium to generate manufactures comprising an amino acid sequence and/or x-ray diffraction data information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a computer readable medium having recorded thereon an amino acid sequence and/or x-ray diffraction data of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the sequence and x-ray data information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the information of the present invention.
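As one illustration of recording sequence information in the form of an ASCII file, the following Python sketch writes a sequence to a plain text file and reads it back. The FASTA-like layout (a header line starting with ">"), the function names, the file name, and the short placeholder sequence are all assumptions made for the example; the sequence is not taken from the figures.

```python
import os
import tempfile


def record_sequence(path, name, sequence, width=60):
    """Record a sequence as ASCII text: one header line, then fixed-width lines."""
    with open(path, "w", encoding="ascii") as handle:
        handle.write(f">{name}\n")
        for start in range(0, len(sequence), width):
            handle.write(sequence[start:start + width] + "\n")


def read_sequence(path):
    """Read back the (name, sequence) pair written by record_sequence."""
    with open(path, encoding="ascii") as handle:
        lines = handle.read().splitlines()
    return lines[0][1:], "".join(lines[1:])


# Round trip with a hypothetical placeholder sequence.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example_sequence.txt")
    record_sequence(path, "example_fragment", "MKTAYIAKQR" * 7)
    name_back, sequence_back = read_sequence(path)
```

The same record could equally be stored as a row in a database table; the text-file form is shown only because it is the simplest of the formats mentioned above.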

By providing the PNS SCP sequence and/or x-ray diffraction data on computer readable medium, a skilled artisan can routinely access the sequence and x-ray diffraction data to model a PNS SCP, a subdomain thereof, or a ligand thereof. Computer algorithms are publicly and commercially available which allow a skilled artisan to access this data provided in a computer readable medium and analyze it for molecular modeling and/or RDD.

The present invention further provides systems, particularly computer-based systems, which contain the sequence and/or diffraction data described herein. Such systems are designed to do molecular modeling and RDD for a PNS SCP or at least one subdomain thereof.

As used herein, "a computer-based system" refers to the hardware means, software means, and data storage means used to analyze the sequence and/or x-ray diffraction data of the present invention. The minimum hardware means of the computer-based systems of the present invention comprises a central processing unit (CPU), input means, output means, and data storage means. A skilled artisan can readily appreciate which of the currently available computer-based systems are suitable for use in the present invention.

As stated above, the computer-based systems of the present invention comprise a data storage means having stored therein a PNS SCP or fragment sequence and/or x-ray diffraction data of the present invention and the necessary hardware means and software means for supporting and implementing an analysis means. As used herein, "data storage means" refers to memory which can store sequence or x-ray diffraction data of the present invention, or a memory access means which can access manufactures having recorded thereon the sequence or x-ray data of the present invention.

As used herein, "search means" or "analysis means" refers to one or more programs which are implemented on the computer-based system to compare a target sequence or target structural motif with the sequence or x-ray data stored within the data storage means. Search means are used to identify fragments or regions of a PNS SCP which match a particular target sequence or target motif. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means is available and can be used in the computer-based systems of the present invention. A skilled artisan can readily recognize that any one of the available algorithms or implementing software packages for conducting computer analyses can be adapted for use in the present computer-based systems.
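A very simple search means of the kind described above can be sketched in Python as an exact scan of a stored sequence for a target sequence. The function name, the use of "X" as a match-anything position, and the short example sequence are hypothetical choices for illustration; published alignment algorithms and commercial packages are considerably more sophisticated.

```python
def find_motif(sequence, motif, wildcard="X"):
    """Return 0-based start offsets where motif matches sequence.

    A motif position equal to `wildcard` matches any residue (an assumption
    of this sketch, not a convention defined in the text).
    """
    if not motif:
        raise ValueError("motif must not be empty")
    hits = []
    for start in range(len(sequence) - len(motif) + 1):
        window = sequence[start:start + len(motif)]
        if all(m == wildcard or m == s for m, s in zip(motif, window)):
            hits.append(start)
    return hits


# Hypothetical stored sequence and targets.
stored = "MKTAYIAKQRKTA"
exact_hits = find_motif(stored, "KTA")
wildcard_hits = find_motif(stored, "AXQ")
```

The offsets returned identify the fragments or regions of the stored sequence that match the target.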

As used herein, "a target structural motif," or "target motif," refers to any rationally selected sequence or combination of sequences in which the sequence(s) are chosen based on a three-dimensional configuration or electron density map which is formed upon the folding of the target motif. There are a variety of target motifs known in the art. Protein target motifs include, but are not limited to, enzymic active sites, structural subdomains, epitopes, functional domains and signal sequences. A variety of structural formats for the input and output means can be used to input and output the information in the computer-based systems of the present invention.

A variety of comparing means can be used to compare a target sequence or target motif with the data storage means to identify structural motifs or electron density maps. A skilled artisan can readily recognize that any one of the publicly available computer modeling programs can be used as the search means for the computer-based systems of the present invention.

One application of this embodiment is provided in FIG. 12. FIG. 12 provides a block diagram of a computer system 102 that can be used to implement the present invention. The computer system 102 includes a processor 106 connected to a bus 104. Also connected to the bus 104 are a main memory 108 (preferably implemented as random access memory, RAM) and a variety of secondary storage memory 110, such as a hard drive 112 and a removable medium storage device 114. The removable medium storage device 114 may represent, for example, a floppy disk drive, a CD-ROM drive, a magnetic tape drive, etc. A removable storage medium 116 (such as a floppy disk, a compact disk, a magnetic tape, etc.) containing control logic and/or data recorded therein may be inserted into the removable medium storage device 114. The computer system 102 includes appropriate software for reading the control logic and/or the data from the removable storage medium 116 once inserted in the removable medium storage device 114. A monitor 120 can be connected to the bus 104 to visualize the structure determination data.

Amino acid, encoding nucleotide or other sequence and/or x-ray diffraction data of the present invention may be stored in a well known manner in the main memory 108, any of the secondary storage devices 110, and/or a removable storage medium 116. Software for accessing and processing the amino acid sequence and/or x-ray diffraction data (such as search tools, comparing tools, etc.) resides in main memory 108 during execution.

Three Dimensional Structure Determination. One or more computer modeling steps and/or computer algorithms are used to provide a molecular 3-D model of a cleaved PNS SCP, using amino acid sequence data from FIGS. 1, 8, 10 or 11 (or variants thereof) and/or x-ray diffraction data. If only the amino acid sequence is used for three-dimensional structure determination, then a suitable modeling program can be used, e.g., LINUS (Rose et al., Proteins: Structure, Function and Genetics (June, 1995), and references cited therein). It is preferred that the PNS SCP model has no or Ala-substituted (for surface) residues in disallowed regions of the Ramachandran plot, and gives a positive 3D-1D profile (Luthy et al., Nature 356:83-85 (1992)), suggesting that all the residues are in acceptable environments (Kraulis (1991), infra). Alternatively, the disallowed regions can be corrected by the use of suitable algorithms, such as the RAVE program described herein. Phase determination is optionally used for solving the three-dimensional structure of a cleaved PNS SCP. This structure can then be used for RDD of modulators of PNS SCP neuraminidase, endothelin cathepsin A or other biological activity, e.g., which is relevant to a PNS SCP related pathology.
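A Ramachandran plot places each residue according to its backbone dihedral angles phi (C of the preceding residue, N, CA, C) and psi (N, CA, C, N of the following residue). As a minimal sketch of the underlying geometry only, the following Python function computes a signed dihedral angle from four atomic positions. The function names and the test coordinates are hypothetical, and the sketch does not define which regions of the plot are allowed or disallowed; validation programs such as those cited herein do that.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_degrees(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond for atoms p0..p3."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)  # unit vector along the central bond
    # Components of the outer bonds perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


# Hypothetical coordinates: a planar trans arrangement and a 90-degree twist.
trans_angle = dihedral_degrees((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0))
twist_angle = dihedral_degrees((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 0, 1))
```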

Density Modification and Map Interpretation. Electron density maps can be calculated using such programs as those from the CCP4 computing package (SERC (UK) Collaborative Computing Project 4, Daresbury Laboratory, UK, 1979). Cycles of two-fold averaging can further be used, such as with the program RAVE (Kleywegt & Jones, Bailey et al, eds., First Map to Final Model, SERC Daresbury Laboratory, UK, pp 59-66 (1994)) and gradual model expansion. For map visualization and model building a program such as "O" (Jones (1991), infra) can be used.

Refinement and Model Validation. Rigid body and positional refinement can be carried out using a program such as X-PLOR (Brunger (1992), infra), e.g., with the stereochemical parameters of Engh and Huber (Acta Cryst. A47:392-400 (1991)). If the model at this stage in the averaged maps still misses residues (e.g., at least 5-10 per subunit), then some or all of the missing residues can be incorporated in the model during additional cycles of positional refinement and model building. The refinement procedure can start using data from lower resolution (e.g., 25-10 Å to 10-3.0 Å) and then be gradually extended to include data from 12-6 Å to 3.0-1.5 Å. B-values for individual atoms can be refined once data between 2.9 and 1.5 Å has been added. Subsequently waters can be gradually added. A program such as ARP (Lamzin and Wilson, Acta Cryst. D49:129-147 (1993)) can be used to add crystallographic waters and as a tool to check for bad areas in the model. Programs such as PROCHECK (Laskowski et al., J. Appl. Cryst. 26:283-291 (1993)), WHATIF (Vriend, J. Mol. Graph. 8:52-56 (1990)) and PROFILE 3D (Luthy et al., Nature 356:83-85 (1992)), as well as the geometrical analysis generated by X-PLOR, can be used to check the structure for errors. For the final refinement cycle, 20-5% of the weakest data can be rejected using an |F_(obs)|/σ cutoff and anisotropic scaling between F_(obs) and F_(calc) applied after careful assessment of the quality and completeness of the data.

Structure Analysis. A program such as DSSP can be used to assign the secondary structure elements (Kabsch and Sander (1983), infra). A program such as SUPPOS (from the BIOMOL crystallographic computing package) can be used for some or all of the least-squares superpositions of various models and parts of models. Solvent accessible surfaces and electrostatic potentials can be calculated using such programs as GRASP (Nicholls et al (1991), infra).

Structure Determination. The structure of a PNS SCP can thus be solved with the molecular replacement procedure, such as by using X-PLOR (Brunger (1992), infra). A partial search model for the monomer can be constructed using a related protein, such as wheat serine carboxypeptidase structure (Liao et al. (1992), infra). The rotation and translation function can be used to yield two or more orientations and positions for two subunits to form a physiological dimer as determined based on their interactions. Cyclical two-fold density averaging can also be done using the RAVE program and model expansion can also be used to add missing residues for each monomer, resulting in a model with 95-99.9% of the total number of residues. The model can be refined in a program such as X-PLOR (Brunger (1992), supra), to a suitable crystallographic R_(factor). The model data is then saved on computer readable medium for use in further analysis, such as rational drug design.
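For orientation, the conventional crystallographic R-factor compares observed and calculated structure factor amplitudes as the sum of | |F_(obs)| - |F_(calc)| | divided by the sum of |F_(obs)|. The following Python sketch evaluates that ratio. The function name and the amplitudes are hypothetical, and the sketch assumes the calculated amplitudes have already been placed on the same scale as the observed ones; refinement programs such as X-PLOR handle scaling, weighting and resolution limits internally.

```python
def r_factor(f_obs, f_calc):
    """Conventional R-factor for paired observed and calculated amplitudes."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need equally sized, non-empty amplitude lists")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


# Hypothetical amplitudes for three reflections.
example_r = r_factor([100.0, 50.0, 25.0], [90.0, 55.0, 30.0])
```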

Rational Design of Drugs that Interact with the PNS SCP. The determination of the three dimensional structure of a cleaved PNS SCP, as described herein, provides a basis for the design of new and specific ligands for the diagnosis and/or treatment of at least one PNS SCP-related pathology. Several approaches can be taken for the use of the crystal structure of a PNS SCP in the rational design of ligands of this protein. A computer-assisted, manual examination of the active site structure is optionally done. Software such as GRID (Goodford, J. Med. Chem. 28:849-857 (1985)), a program that determines probable interaction sites between probes with various functional group characteristics and the enzyme surface, is used to analyze the active site to determine structures of inhibiting compounds. The program calculations, with suitable inhibiting groups on molecules (e.g., protonated primary amines) as the probe, are used to identify potential hotspots around accessible positions at suitable energy contour levels. Suitable ligands, as inhibiting or stimulating modulating compounds or compositions, are then tested for modulating activities of at least one PNS SCP.

A diagnostic or therapeutic PNS SCP modulating ligand of the present invention can be, but is not limited to, at least one selected from a nucleic acid, a compound, a protein, an element, a lipid, an antibody, a saccharide, an isotope, a carbohydrate, an imaging agent, a lipoprotein, a glycoprotein, an enzyme, a detectable probe, an antibody or fragment thereof, or any combination thereof, which can be detectably labeled as for labeling antibodies. Such labels include, but are not limited to, enzymatic labels, radioisotope or radioactive compounds or elements, fluorescent compounds or metals, chemiluminescent compounds and bioluminescent compounds. Alternatively, any other known diagnostic or therapeutic agent can be used in a method of the invention.

After preliminary experiments are done to determine the K_(m) of the substrate with each enzyme activity of a PNS SCP, the time-dependent nature of modulation and the ligand K_(i) values are determined (e.g., by the method of Henderson (Biochem. J. 127:321-333 (1972))). For example, the ligand (or blank where appropriate) and enzyme are pre-incubated in buffer. Reactions are initiated by the addition of substrate. Aliquots are removed over a suitable time course and each is quenched by addition to a suitable quenching solution (e.g., sodium hydroxide in aqueous ethanol). The concentration of product is determined, e.g., fluorometrically, using a spectrometer. Plots of fluorescence against time can be close to linear over the assay period, and are used to obtain values for the initial velocity in the presence (V_(i)) or absence (V_(o)) of ligand. Error is present in both axes in a Henderson plot, making it inappropriate for standard regression analysis (Leatherbarrow, Trends Biochem. Sci. 15:455-458 (1990)). Therefore, K_(i) values are obtained from the data by fitting to a modified version of the Henderson equation for competitive inhibition:

    Q r^{2} + (E_{t} - Q - I_{r}) r - E_{t} = 0

where (using the notation of Henderson (Biochem. J. 127:321-333 (1972)): ##EQU1##

This equation is solved for the positive root with the constraint that

    Q = K_{i} ((A_{t} + K_{a}) / K_{a})

using PROC NLIN from SAS (SAS Institute Inc., Cary, N.C., USA), which performs nonlinear regression using least-squares techniques. The iterative method used is optionally the multivariate secant method, similar to the Gauss-Newton method, except that the derivatives in the Taylor series are estimated from the history of iterations rather than supplied analytically. A suitable convergence criterion is optionally used, e.g., where there is a change in loss function of less than 10⁻⁸.
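The fitting step can be sketched in a few lines. The sketch below is not the SAS procedure: it swaps in a simple golden-section search on log10(K_(i)) in place of the multivariate secant method, and it stops on the width of the search interval rather than on a change in the loss function. Two readings are assumed because the defining expression is not reproduced in this text: the ratio r is taken as V_(o)/V_(i), and the symbol written I_(r) in the equation is taken as the total ligand concentration (named `i_t` below). All function names and numerical values are hypothetical.

```python
import math


def positive_root(q, e_t, i_t):
    """Positive root r of Q r^2 + (E_t - Q - I) r - E_t = 0 (requires q > 0, e_t > 0)."""
    b = e_t - q - i_t
    disc = b * b + 4.0 * q * e_t
    return (-b + math.sqrt(disc)) / (2.0 * q)


def q_from_ki(k_i, a_t, k_a):
    """Constraint Q = K_i (A_t + K_a) / K_a."""
    return k_i * (a_t + k_a) / k_a


def sum_squares(k_i, data, e_t, a_t, k_a):
    """Least-squares loss between observed and predicted r values."""
    q = q_from_ki(k_i, a_t, k_a)
    return sum((r_obs - positive_root(q, e_t, i_t)) ** 2 for i_t, r_obs in data)


def fit_ki(data, e_t, a_t, k_a, lo=1e-10, hi=1e-5, tol=1e-8):
    """Golden-section search for K_i on a log10 scale (stand-in for PROC NLIN)."""
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = math.log10(lo), math.log10(hi)

    def loss(x):
        return sum_squares(10.0 ** x, data, e_t, a_t, k_a)

    while abs(b - a) > tol:
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if loss(c) < loss(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return 10.0 ** ((a + b) / 2.0)


# Synthetic, noise-free data generated from an assumed K_i (all in mol/L).
true_ki = 2e-8
e_t, a_t, k_a = 1e-9, 1e-4, 1e-4
ligand_levels = [1e-8, 3e-8, 1e-7, 3e-7, 1e-6]
data = [(i, positive_root(q_from_ki(true_ki, a_t, k_a), e_t, i)) for i in ligand_levels]
fitted = fit_ki(data, e_t, a_t, k_a)
print(f"fitted K_i = {fitted:.3e}")
```

With no ligand present the quadratic factors as (Q r + E_(t))(r - 1) = 0, so the positive root is r = 1, which gives a quick sanity check on the implementation.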

Once modulating ligands are found and isolated or synthesized, crystallographic studies of the compounds complexed to a PNS SCP are performed. As a non-limiting example, PNS SCP crystals are soaked for 2 days in 0.01-100 mM ligand and X-ray diffraction data are collected on an area detector and/or an image plate detector (e.g., a Mar image plate detector) using a rotating anode X-ray source. Data are collected to as high a resolution as possible, e.g., 1.5-3.5 Å, and merged with an R-factor on suitable intensities. An atomic model of the inhibitor is built into the difference Fourier map (F_(inhibitor complex) - F_(native)). The model can be refined to a solution in a cycle of simulated annealing (Brunger (1987), infra) involving 10-500 cycles of energy refinement, 100-10,000 1-fs steps of room temperature dynamics and/or 10-500 more cycles of energy refinement. Harmonic restraints are also used for the atom refinement, except for atoms within a 10-15 Å radius of the inhibitor. An R-factor is selected for the model for both the r.m.s. deviations from the ideal bond lengths, as well as for the angles, respectively. Direct measurements of enzyme inhibition provide further confirmation that the modeled ligands are modulators of at least one biological activity of a PNS SC.

Ligands of a PNS SCP, based on the crystal structure of this enzyme, are thus also provided by the present invention. Demonstration of clinically useful levels of activity, e.g., in vivo activity, is also important. PNS SCP inhibitors are evaluated for biological activity in animal models (e.g., rat, mouse, rabbit) using various oral and parenteral routes of administration. Using this approach, it is expected that modulation of a PNS SCP occurs in suitable animal models, using the ligands discovered by molecular modeling and x-ray crystallography.

Diagnostic and/or Therapeutic Agents. A diagnostic or therapeutic PNS SCP modulating agent or ligand of the present invention can be, but is not limited to, at least one selected from a nucleic acid, a compound, a protein, an element, a lipid, an antibody, a saccharide, an isotope, a carbohydrate, an imaging agent, a lipoprotein, a glycoprotein, an enzyme, a detectable probe, and antibody or fragment thereof, or any combination thereof, which can be detectably labeled as for labeling antibodies, as described herein. Such labels include, but are not limited to, enzymatic labels, radioisotope or radioactive compounds or elements, fluorescent compounds or metals, chemiluminescent compounds and bioluminescent compounds. Alternatively, any other known diagnostic or therapeutic agent can be used in a method of the invention.

A therapeutic agent used in the invention can have a therapeutic effect on the target cell, such as a cell or neuron of the peripheral nervous system, the effect selected from, but not limited to: correcting a defective gene or protein, a drug action, a toxic effect, a growth stimulating effect, a growth inhibiting effect, a metabolic effect, a catabolic effect, an anabolic effect, a neurohumoral effect, a cell differentiation stimulatory effect, a cell differentiation inhibitory effect, a neuromodulatory effect, a pluripotent stem cell stimulating effect, and any other known therapeutic effect that modulates at least one SC in a cell of the peripheral nervous system. Such an effect can be provided by a therapeutic agent delivered to a target cell via pharmaceutical administration or via a delivery vector according to the invention.

A therapeutic nucleic acid as a therapeutic agent can have, but is not limited to, at least one of the following therapeutic effects on a target cell: inhibiting transcription of a DNA sequence; inhibiting translation of an RNA sequence; inhibiting reverse transcription of an RNA or DNA sequence; inhibiting a post-translational modification of a protein; inducing transcription of a DNA sequence; inducing translation of an RNA sequence; inducing reverse transcription of an RNA or DNA sequence; inducing a post-translational modification of a protein; transcription of the nucleic acid as an RNA; translation of the nucleic acid as a protein or enzyme; and incorporating the nucleic acid into a chromosome of a target cell for constitutive or transient expression of the therapeutic nucleic acid.

Therapeutic effects of therapeutic nucleic acids can include, but are not limited to: turning off a defective gene or processing the expression thereof, such as antisense RNA or DNA; inhibiting viral replication or synthesis; gene therapy as expressing a heterologous nucleic acid encoding a therapeutic protein or correcting a defective protein; modifying a defective or underexpression of an RNA such as an hnRNA, an mRNA, a tRNA, or an rRNA; encoding a drug or prodrug, or an enzyme that generates a compound as a drug or prodrug in pathological or normal cells expressing the chimeric receptor; and any other known therapeutic effects.

A therapeutic nucleic acid of the invention which encodes, or provides the therapeutic effect of, any known toxin, prodrug or gene drug for delivery to pathogenic nervous cells can also include genes under the control of a tissue specific transcriptional regulatory sequence (TRS) specific for pathogenic SC containing cells. Such TRSs would further limit the expression of the therapeutic agent in the target cell, according to known methods.

Non-limiting examples of such PNS SCP modulating agents or ligands of the present invention and methods thereof include methyl/halophenyl-substituted piperazine compounds, such as lidoflazine (see, e.g., Merck Index Monograph 5311 and U.S. Pat. No. 3,267,104, both entirely incorporated herein by reference). Such compounds were tested and found to inhibit sodium channel activity of at least one PNS SCP of the present invention in cell lines expressing at least one PNS SCP, such as PC12, PKI-4 and other isolated or recombinant cells expressing at least one PNS SCP of the present invention. Accordingly, the present invention provides PNS SCP modulating agents or ligands as methyl/halophenyl-substituted piperazines. The substitutions can include alkyl- and/or halophenyl-substituted piperazines.

Pharmaceutical/Diagnostic Administration. Using PNS SCP modulating compounds or compositions (including antagonists and agonists as described above) the present invention further provides a method for modulating the activity of the PNS SCP protein in a cell. In general, agents (antagonists or agonists) which have been identified to inhibit or enhance the activity of PNS SCP can be formulated so that the agent can be contacted with a cell expressing a PNS SCP protein in vivo. The contacting of such a cell with such an agent results in the in vivo modulation of the activity of the PNS SCP proteins. So long as a formulation barrier or toxicity barrier does not exist, agents identified in the assays described above will be effective for in vivo use.

In another embodiment, the invention relates to a method of administering PNS SCP or a PNS SCP modulating compound or composition (including PNS SCP antagonists and agonists) to an animal (preferably, a mammal (specifically, a human)) in an amount sufficient to effect an altered level of PNS SCP in the animal. The administered PNS SC or PNS SCP modulating compound or composition could specifically affect PNS SCP associated functions. Further, since PNS SCP is expressed in peripheral nervous system tissue, administration of PNS SC or PNS SCP modulating compound or composition could be used to alter PNS SCP levels in the peripheral nervous system.

PNS SCP antagonists can be used to treat pain due to trauma or pathology involving the central or peripheral nervous system, or pathologies related to the abnormally high levels of expression of at least one naturally occurring nervous system specific (NS) sodium channel (SC), where a PNS SCP antagonist also inhibits at least one NS SC, or where the pain is mediated to some extent by PNS SC. Such pathologies include, but are not limited to: inflammatory diseases; neuropathies (e.g., diabetic neuropathy); dystrophies (e.g., reflex sympathetic dystrophy, post-herpetic neuralgia); trauma (tissue damage by any cause); and focal pain by any cause.

Inflammatory diseases can include, but are not limited to, chronic inflammatory pathologies and vascular inflammatory pathologies. Chronic inflammatory pathologies include, but are not limited to sarcoidosis, chronic inflammatory bowel disease, ulcerative colitis, and Crohn's pathology and vascular inflammatory pathologies, such as, but not limited to, disseminated intravascular coagulation, atherosclerosis, and Kawasaki's pathology.

PNS SCP agonists can be used to treat pathologies involving the central or peripheral nervous system, or pathologies related to the abnormally low levels of expression of at least one naturally occurring nervous system specific (NS) sodium channel (SC), where a PNS SCP agonist also enhances or stimulates at least one NS SC. Such pathologies include, but are not limited to, neurodegenerative diseases; diseases of the gastrointestinal tract due to dysfunction of the enteric nervous system (e.g., colitis, ileitis, inflammatory bowel syndrome); diseases of the cardiovascular system (e.g., hypertension and congestive heart failure); diseases of the genitourinary tract involving sympathetic and parasympathetic innervation (e.g., benign prostatic hyperplasia, impotence); and diseases of the neuromuscular system (e.g., muscular dystrophy, multiple sclerosis, epilepsy).

Neurodegenerative diseases can include, but are not limited to, demyelinating diseases, such as multiple sclerosis and acute transverse myelitis; hyperkinetic movement disorders, such as Huntington's Chorea and senile chorea; hypokinetic movement disorders, such as Parkinson's disease; progressive supranuclear palsy; spinocerebellar degenerations, such as spinal ataxia, Friedreich's ataxia; multiple systems degenerations (Menzel, Dejerine-Thomas, Shy-Drager, and Machado-Joseph); and systemic disorders (Refsum's disease, abetalipoproteinemia, ataxia telangiectasia, and mitochondrial multi-system disorder); demyelinating core disorders, such as multiple sclerosis, acute transverse myelitis; disorders of the motor unit, such as neurogenic muscular atrophies (anterior horn cell degeneration, such as amyotrophic lateral sclerosis, infantile spinal muscular atrophy and juvenile spinal muscular atrophy); or any subset thereof.

Pharmaceutical/diagnostic administration of diagnostic/pharmaceutical compound or composition of the invention, for a PNS SC related pathology can be administered by any means that achieve its intended purpose, for example, to treat or prevent a cancer or precancerous condition.

The term "protection", as in "protection from infection or disease", as used herein, encompasses "prevention," "suppression" or "treatment." "Prevention" involves administration of a pharmaceutical composition prior to the induction of the disease. "Suppression" involves administration of the composition prior to the clinical appearance of the disease. "Treatment" involves administration of the protective composition after the appearance of the disease. It will be understood that in human and veterinary medicine, it is not always possible to distinguish between "preventing" and "suppressing" since the ultimate inductive event or events can be unknown or latent, or the patient may not be ascertained until well after the occurrence of the event or events. Therefore, it is common to use the term "prophylaxis" as distinct from "treatment" to encompass both "preventing" and "suppressing" as defined herein. The term "protection," as used herein, is meant to include "prophylaxis." See, e.g., Berkow, infra, Goodman, infra, Avery, infra and Katzung, infra, which are entirely incorporated herein by reference, including all references cited therein. The "protection" provided need not be absolute, i.e., the disease need not be totally prevented or eradicated, provided that there is a statistically significant improvement relative to a control population. Protection can be limited to mitigating the severity or rapidity of onset of symptoms of the disease.

At least one PNS SC modulating compound or composition of the invention can be administered by any means that achieve the intended purpose, using a pharmaceutical composition as previously described.

For example, administration can be by various parenteral routes such as subcutaneous, intravenous, intradermal, intramuscular, intraperitoneal, intranasal, intracranial, transdermal, or buccal routes. Alternatively, or concurrently, administration can be by the oral route. Parenteral administration can be by bolus injection or by gradual perfusion over time.

An additional mode of using a diagnostic/pharmaceutical compound or composition of the invention is by topical application. A diagnostic/pharmaceutical compound or composition of the invention can be incorporated into topically applied vehicles such as salves or ointments.

For topical applications, it is preferred to administer an effective amount of a diagnostic/pharmaceutical compound or composition according to the invention to a target area, e.g., skin surfaces, mucous membranes, and the like, which are adjacent to the peripheral neurons to be treated. This amount will generally range from about 0.0001 mg to about 1 g of a PNS SC modulating compound per application, depending upon the area to be treated, whether the use is diagnostic, prophylactic or therapeutic, the severity of the symptoms, and the nature of the topical vehicle employed. A preferred topical preparation is an ointment, wherein about 0.001 to about 50 mg of active ingredient is used per cc of ointment base.

A typical regimen for treatment or prophylaxis comprises administration of an effective amount over a period of one or several days, up to and including between one week and about six months.

It is understood that the dosage of a diagnostic/pharmaceutical compound or composition of the invention administered in vivo or in vitro will be dependent upon the age, sex, health, and weight of the recipient, kind of concurrent treatment, if any, frequency of treatment, and the nature of the diagnostic/pharmaceutical effect desired. The ranges of effective doses provided herein are not intended to be limiting and represent preferred dose ranges. However, the most preferred dosage will be tailored to the individual subject, as is understood and determinable by one skilled in the relevant arts. See, e.g., Berkow et al, eds., The Merck Manual. 16th edition, Merck and Co., Rahway, N.J., 1992, Goodman et al., eds., Goodman and Gilman's The Pharmacological Basis of Therapeutics, 8th edition, Pergamon Press, Inc., Elmsford, N.Y., (1990); Avery's Drug Treatment: Principles and Practice of Clinical Pharmacology and Therapeutics, 3rd edition, ADIS Press, LTD., Williams and Wilkins, Baltimore, Md. (1987), Ebadi, Pharmacology, Little, Brown and Co., Boston, (1985); Osol et al., eds. Remington's Pharmaceutical Sciences, 18th edition, Mack Publishing Co., Easton, Pa. (1990); Katzung, Basic and Clinical Pharmacology, Appleton and Lange, Norwalk, Conn. (1992), which references are entirely incorporated herein by reference.

The total dose required for each treatment can be administered by multiple doses or in a single dose. The diagnostic/pharmaceutical compound or composition can be administered alone or in conjunction with other diagnostics and/or pharmaceuticals directed to the pathology, or directed to other symptoms of the pathology.

Effective amounts of a diagnostic/pharmaceutical compound or composition of the invention are from about 0.1 μg to about 100 mg/kg body weight, administered at intervals of 4-72 hours, for a period of 2 hours to 1 year, and/or any range or value therein, such as 0.0001-1.0, 1-10, 10-50 and 50-100, 0.0001-0.001, 0.001-0.01, 0.01-0.1, 0.1-1.0, 1.0-10, 5-10, 10-20, 20-50 and 50-100 mg/kg, at intervals of 1-4, 4-10, 10-16, 16-24, 24-36, 36-48, 48-72 hours, for a period of 1-14, 14-28, or 30-44 days, or 1-24 weeks, or any range or value therein.
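The arithmetic implied by a weight-based dose, a dosing interval and a treatment period can be illustrated as follows. This is only a sketch of the calculation: the function name is hypothetical, the example values (1 mg/kg, 70 kg, every 24 hours for 14 days) are arbitrary points inside the ranges recited above, the lower bound is read as 0.0001 mg/kg in line with the enumerated sub-ranges, and administrations are assumed to start at time zero and to fall strictly before the end of the period.

```python
import math


def dose_schedule(dose_mg_per_kg, body_weight_kg, interval_h, period_h):
    """Return (mg per administration, number of administrations, total mg)."""
    # Bounds taken from the recited range, read as 0.0001-100 mg/kg (assumption).
    if not 0.0001 <= dose_mg_per_kg <= 100.0:
        raise ValueError("dose outside the recited 0.0001-100 mg/kg range")
    if body_weight_kg <= 0 or interval_h <= 0 or period_h <= 0:
        raise ValueError("weight, interval and period must be positive")
    per_dose_mg = dose_mg_per_kg * body_weight_kg
    # First administration at t = 0, then one every interval_h until the period ends.
    n_doses = math.ceil(period_h / interval_h)
    return per_dose_mg, n_doses, per_dose_mg * n_doses


per_dose, count, total = dose_schedule(1.0, 70.0, 24, 14 * 24)
print(per_dose, count, total)
```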

The recipients of administration of compounds and/or compositions of the invention can be any vertebrate animal, such as mammals, birds, bony fish, frogs and toads. Among mammals, the preferred recipients are mammals of the Orders Primates (including humans, apes and monkeys), Artiodactyla (including horses, goats, cows, sheep, pigs), Rodentia (including mice, rats, rabbits, and hamsters), and Carnivora (including cats and dogs). Among birds, the preferred recipients are turkeys, chickens and other members of the same order. The most preferred recipients are humans.

Gene Therapy. A delivery vector of the present invention can be, but is not limited to, a viral vector, a liposome, an anti-PNS SCP or anti-SC antibody, or a SC ligand, one or more of which delivery vectors is associated with a diagnostic or therapeutic agent.

The delivery vector can comprise any diagnostic or therapeutic agent which has a therapeutic or diagnostic effect on the target cell. The target cell specificity of the delivery vector is thus provided by use of a target cell specific delivery vector.

The delivery vector can also be a recombinant viral vector comprising at least one binding domain selected from the group consisting of an antibody or fragment, a chimeric binding site antibody or fragment, a target cell or specific ligand, a receptor which binds a target cell ligand, an anti-idiotypic antibody, a liposome or other component which is specific for the target cell. A PNS SCP can be already associated with the target cell, or the delivery vector can bind the target cell via a ligand to a target cell receptor or vice versa.

Thus, the therapeutic or diagnostic agent, such as a therapeutic or diagnostic nucleic acid, protein, drug, compound, composition and the like, is delivered preferentially to the target cell, e.g., where the nucleic acid is preferably incorporated into the chromosome of the target cell, to the partial or complete exclusion of non-target cells.

The invention is thus intended to provide delivery vectors, containing one or more therapeutic and/or diagnostic agents, including vectors suitable for gene therapy.

In a method of treating a PNS SCP-associated disease in a patient in need of such treatment, functional PNS SCP DNA can be provided to the PNS cells of such patient in a manner and amount that permits the expression of the PNS SCP protein provided by such gene, for a time and in a quantity sufficient to treat such patient, such as a suitable delivery vector. Many vector systems are known in the art to provide such delivery to human patients in need of a gene or protein missing from the cell. For example, retrovirus systems can be used, especially modified retrovirus systems and especially herpes simplex virus systems. Such methods are provided for, in, for example, the teachings of Breakefield, et al., The New Biologist 3:203-218 (1991); Huang, Q. et al., Experimental Neurology 115:303-316 (1992), WO93/03743 and WO90/09441. Delivery of a DNA sequence encoding a functional PNS SCP protein will effectively replace the missing or mutated PNS SCP gene of the invention.

In another embodiment of this invention, the PNS SCP modulating compound or composition is expressed as a recombinant gene in a cell, so that the cells can be transplanted into a mammal, preferably a human in need of gene therapy. To provide gene therapy to an individual, a genetic sequence which encodes for all or part of the PNS SCP modulating compound or composition is added into a vector and introduced into a host cell. Examples of diseases that can be suitable for gene therapy include, but are not limited to, neurodegenerative diseases or disorders, Alzheimer's, schizophrenia, epilepsy, neoplasms and cancer. Examples of vectors that can be used in gene therapy include, but are not limited to, defective retroviral, adenoviral, or other viral vectors (Mulligan, R. C., Science 260:926-932 (1993)). See Anderson, Gene Therapy, 246 J. Amer. Med. Assn. 2737 (1980); Friedmann, Progress toward human gene therapy, 244 Science 1275 (1989); Anderson, 256 Science 808 (1992); human gene therapy protocols published in Human Gene Therapy, Mary Ann Liebert Publishers, N.Y. (1990-1994); Bank et al., 565 Ann. N.Y. Acad. Sci. 37 (1989); LTR-Vectors (U.S. Pat. No. 4,405,712); Ausubel , infra, §§ 9.10-9.17; Jon A. Wolff., ed, Gene Therapeutics: methods and applications of direct gene transfer, Birkhauser, Boston (1994).

The means by which the vector carrying the gene can be introduced into the cell include, but are not limited to, microinjection, electroporation, transduction, or transfection using DEAE-Dextran, lipofection, calcium phosphate or other procedures known to one skilled in the art (Sambrook, infra; Ausubel, infra).

Preparations for parenteral administration include sterile aqueous or non-aqueous solutions, suspensions, and emulsions. Examples of non-aqueous solvents are propylene glycol, polyethylene glycol, vegetable oils such as olive oil, and injectable organic esters such as ethyl oleate. Aqueous carriers include water, alcoholic/aqueous solutions, emulsions or suspensions, including saline and buffered media. Parenteral vehicles include sodium chloride solution, Ringer's dextrose and sodium chloride, lactated Ringer's, or fixed oils. Intravenous vehicles include fluid and nutrient replenishers, electrolyte replenishers, such as those based on Ringer's dextrose, and the like. Preservatives and other additives can also be present, such as, for example, antimicrobials, antioxidants, chelating agents, inert gases and the like. See, generally, Osol et al., eds. Remington's Pharmaceutical Science, 16th Ed., (1980).

In another embodiment, the invention relates to a pharmaceutical composition comprising PNS SC or PNS SCP modulating compound or composition in an amount sufficient to alter PNS SCP associated activity, and a pharmaceutically acceptable diluent, carrier, or excipient. Appropriate concentrations and dosage unit sizes can be readily determined by one skilled in the art (See, e.g., Osol et al. ed., Remington's Pharmaceutical Sciences, 16th Ed., Mack, Easton Pa. (1980) and WO 91/19008).

Included as well in the invention are pharmaceutical compositions comprising an effective amount of at least one PNS SCP antisense oligonucleotide, in combination with a pharmaceutically acceptable carrier. Such antisense oligos include, but are not limited to, at least one nucleotide sequence of 12-500 bases in length which is complementary to a DNA sequence of SEQ ID NO:1, or a DNA sequence encoding at least 4 amino acids of SEQ ID NO:2 or FIGS. 11A-11E.

Alternatively, the PNS SCP nucleic acid can be combined with a lipophilic carrier such as any one of a number of sterols including cholesterol, cholate and deoxycholic acid. A preferred sterol is cholesterol.

The PNS SCP gene therapy nucleic acids and the pharmaceutical compositions of the invention can be administered by any means that achieve their intended purpose. For example, administration can be by parenteral, subcutaneous, intravenous, intramuscular, intra-peritoneal, or transdermal routes. The dosage administered will be dependent upon the age, health, and weight of the recipient, kind of concurrent treatment, if any, frequency of treatment, and the nature of the effect desired.

Compositions within the scope of this invention include all compositions wherein the PNS SCP antisense oligonucleotide is contained in an amount effective to achieve enhanced expression of at least one PNS SCP in a peripheral nervous system neuron or ganglion. While individual needs vary, determination of optimal ranges of effective amounts of each component is within the skill of the art. Typically, the PNS SCP nucleic acid can be administered to mammals, e.g., humans, at a dose of 0.005 to 1 mg/kg/day, or an equivalent amount of the pharmaceutically acceptable salt thereof, per day of the body weight of the mammal being treated.

Suitable formulations for parenteral administration include aqueous solutions of the PNS SCP nucleic acid in water-soluble form, for example, water-soluble salts. In addition, suspensions of the active compounds as appropriate oily injection suspensions can be administered. Suitable lipophilic solvents or vehicles include fatty oils, for example, sesame oil, or synthetic fatty acid esters, for example, ethyl oleate or triglycerides. Aqueous injection suspensions can contain substances which increase the viscosity of the suspension, including, for example, sodium carboxymethyl cellulose, sorbitol, and/or dextran. Optionally, the suspension can also contain stabilizers.

Alternatively, at least one PNS SCP can be coded by DNA constructs which are administered in the form of virions, which are preferably incapable of replicating in vivo (see, for example, Taylor, WO 92/06693). For example, such DNA constructs can be administered using herpes-based viruses (Gage et al., U.S. Pat. No. 5,082,670). Alternatively, PNS SCP antisense RNA sequences, PNS SCP ribozymes, and PNS SCP EGS can be coded by RNA constructs which are administered in the form of virions, such as recombinant, replication deficient retroviruses or adenoviruses. The preparation of retroviral vectors is well known in the art (see, for example, Brown et al., "Retroviral Vectors," in DNA Cloning. A Practical Approach, Volume 3, IRL Press, Washington, D.C. (1987)).

Specificity for gene expression in the peripheral nervous system can be conferred by using appropriate cell-specific regulatory sequences, such as cell-specific enhancers and promoters. Since protein phosphorylation is critical for neuronal regulation (Kennedy, "Second Messengers and Neuronal Function," in An Introduction to Molecular Neurobiology, Hall, Ed., Sinauer Associates, Inc. (1992)), protein kinase promoter sequences can be used to achieve sufficient levels of PNS SCP gene expression.

Thus, gene therapy can be used to alleviate sodium channel related pathology by inhibiting the inappropriate expression of a particular form of PNS SC. Moreover, gene therapy can be used to alleviate such pathologies by providing the appropriate expression level of a particular form of PNS SCP. In this case, particular PNS SCP nucleic acid sequences can be coded by DNA or RNA constructs which are administered in the form of viruses, as described above.

Having now generally described the invention, the same will be more readily understood through reference to the following Examples which are provided by way of illustration, and are not intended to be limiting of the invention, unless specified.

EXAMPLE 1 Cloning and Sequencing of a PNS SC Encoding Nucleic Acid

Materials and Methods

Cell Culture. PC12 cells and PKI-4 PC12 subclones were grown as previously described (Mandel et al., 1988). NGF (2.5 S subunit, kindly supplied by Dr. S. Halegoua, SUNY at Stony Brook), was added to the culture medium at final concentration of 110 ng/ml. The PKI-4 PC12 subclone which expresses the cAMP-dependent kinase inhibitor protein (PKI) was also provided by Dr. S. Halegoua (see D'Arcangelo et al., J. Cell Biol. 122:915-921 (1993)).

PCR Amplification. Total cellular RNA was isolated, according to the method of Cathala et al., DNA 2:329-335 (1983), from a PC12 subclone (PKI-4) which expresses high levels of the cAMP-dependent protein kinase inhibitor protein. Two μg of total RNA prepared from NGF-treated PKI-4 cells was used to synthesize first strand cDNA using random hexamer primers for the reverse transcriptase reaction. The cDNA then served as template for the PCR amplification, using a pair of degenerate oligonucleotide primers that specified a 400 base pair region within repeat domain III of the sodium channel α subunit gene. The 5' primer (designated YJ1: GCGAAGCTT(TC)TIATITT(TC)I(GATC)IAT(ATC)ATGGG (SEQ ID NO:3), underline indicates a HindIII restriction site) corresponded to amino acids FWLIFSIM (SEQ ID NO:4) at positions 1347-1354 in the type II sodium channel gene. The 3' primer (designated YO1C: GCAGGATCC (AG)TT(AG)AAA(AG)TT(AG)TC(AGT)AT(AGT)AT(AGCT)AC(AGCT)CC (SEQ ID NO:5), underline indicates a BamH1 restriction site) corresponded to amino acids GVIIDNFN (SEQ ID NO:6) at positions 1470-1477 in the type II gene. The amplification reaction mixture consisted of 5% of the cDNA, 1 mM MgCl₂, 0.2 mM dNTPs, 0.5 μM each primer, Taq polymerase (Perkin-Elmer) in a buffer consisting of 0.1 M KCl, 0.1 M TRIS HCl (pH 8.3) and gelatin (1 mg/ml). The reaction was performed in a Perkin-Elmer thermocycler as follows: 5 cycles of denaturation (94° C., 1 min.), annealing (37° C., 1 min.), and extension (72° C., 1 min.) followed by 25 cycles of denaturation (94° C., 1 min.), annealing (50° C., 1 min.) and extension (72° C., 1 min.). The PCR products were excised from a low melt agarose gel (SEAPLAQUE GTG, FMC BIOPRODUCTS) and subcloned into a Bluescript II SK plasmid vector previously restricted with HindIII and BamH1. The clones were screened for cDNA inserts by miniprep (Sambrook et al., infra) and sequenced in both directions by dideoxy chain termination (Sequenase 2.0 kit, UNITED STATES BIOCHEMICAL). Sequence data was compiled and analyzed using GENWORKS software (INTELLIGENETICS, INC., Mountain View, Calif.).
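The degeneracy of the two primers, i.e., the number of distinct oligonucleotide sequences each mixed-base notation represents, can be counted as in the following minimal sketch. The function name is hypothetical. Each parenthesized group is treated as one mixed position, and each "I" is assumed to denote inosine and is counted as a single, non-degenerate position.

```python
import re

# Primer strings as written above; whitespace inside the notation is ignored.
YJ1 = "GCGAAGCTT(TC)TIATITT(TC)I(GATC)IAT(ATC)ATGGG"
YO1C = "GCAGGATCC (AG)TT(AG)AAA(AG)TT(AG)TC(AGT)AT(AGT)AT(AGCT)AC(AGCT)CC"


def degeneracy(primer):
    """Number of distinct sequences encoded by a primer with (XY...) mixed positions."""
    count = 1
    for group in re.findall(r"\(([A-Z]+)\)", primer.replace(" ", "")):
        count *= len(set(group))  # one factor per mixed position
    return count


for name, primer in (("YJ1", YJ1), ("YO1C", YO1C)):
    print(name, degeneracy(primer))
```

Running the sketch on the two strings gives 48 for YJ1 (2 x 2 x 4 x 3) and 2304 for YO1C (2 x 2 x 2 x 2 x 3 x 3 x 4 x 4).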

cDNA Library Construction and Screening. Poly(A)+ mRNA from the PKI-4 PC12 subclone was purified (mRNA purification kit, PHARMACIA) and used to construct a random- and oligo (dT)-primed Lambda ZAP II cDNA library (STRATAGENE CORP., La Jolla, Calif.). The library consisted of 5.6×10⁶ independent clones prior to amplification. Screening of approximately 4×10⁶ recombinants using the cloned PCR product pPC12-1 labeled by random primers (PHARMACIA kit) resulted in isolation of 5 cDNAs ranging in size from 1-3 kb. Sequence analysis and comparison to published sequences established that two of the cDNAs together encoded 3033 bp of the novel sodium channel α subunit, PN 1.

Northern blot analysis and ribonuclease protection assays. Total cellular RNA was isolated from adult Sprague-Dawley rat brain, spinal cord, superior cervical ganglion, dorsal root ganglion, skeletal muscle, cardiac muscle, and adrenal gland using the standard method of Chirgwin, Biochemistry 18:5294-5299 (1979). RNA was electrophoresed and transferred to nylon membrane (DURALON-UV; STRATAGENE CORP.) as previously described (Cooperman et al., Proc. Nat'l Acad Sci. USA 84:8721 (1987)). RNA blots were cross-linked to the nylon using a Stratalinker UV crosslinker (STRATAGENE CORP.) and hybridized to ³²P-UTP-labeled antisense RNA probes generated from the following linearized templates: pPC12-1, pRB211 (Cooperman, infra, 1987), p1B15 (cyclophilin; Danielson et al., DNA 7:261-267 (1988)), and pNach1 (rat brain type I), which contains 51 bp of intron, 5' untranslated sequence and 267 bp of coding sequence of the type I sodium channel. RNA probes were transcribed with either T3 (pPC12-1), T7 (pNach1), or SP6 (pRB211, p1B15) RNA polymerase according to the manufacturer's instructions (PROMEGA CORP., Madison, Wis.). The blots were washed once in 2× SSC, 0.1% NaDodSO₄ for 15 min. at 68° C., followed by two washes in 0.2× SSC, 0.1% NaDodSO₄ for 15 min. at 68° C. Autoradiography with preflashed XAR-5 film (EASTMAN KODAK CO., Rochester, N.Y.) was used for quantitation of mRNA by densitometry.

Ribonuclease protection assays were performed by use of a kit (RPA II, AMBION INC., Austin, Tex.). Total RNA was hybridized with 10⁴ cpm of antisense RNA probe generated from pPC12-1. To control for differences in the amount of total RNA between samples, we included an antisense RNA probe for β actin, transcribed from pTRI-β-actin (AMBION, INC.).

In situ hybridization. Tissue preparation and hybridization were performed using a modification of the procedure described by Yokouchi et al., Develop. 113:431-444 (1991). SCG and DRG were dissected from adult Sprague-Dawley rats and fixed in 4% paraformaldehyde (in 0.1 M PBS) for 2-6 hrs. at 4° C. The tissue was then rinsed ≈5 min. in 0.1 M PBS (pH 7.3), cryoprotected in 30% sucrose (in 0.1 M PBS) for 2 hrs. at 4° C. and embedded in O.C.T. (TISSUE-TEK). Cryostat sections (14 μm) were collected on SUPERFROST/Plus slides (FISHER SCIENTIFIC), dried ≈2 hrs. at room temp., and then stored at -80° C.

Immediately before prehybridization, sections were brought to room temp. and rehydrated in 0.1M PBS (pH 7.3) containing 0.3% Triton X-100 for 5 min. Sections were then treated with 0.2 N HCl for 20 min., washed in 0.1 M PBS for 5 min., and digested with proteinase K (5 μg/ml in 0.1 M PBS) for 40 min. at 37° C. Sections were then postfixed with 4% paraformaldehyde (in 0.1 M PBS), rinsed with 0.1 M PBS containing 0.1 M glycine for 15 min., and equilibrated in 50% formamide, 2× SSC for 1 hr. (room temp.).

Sections were hybridized with antisense digoxigenin-labeled RNA probes transcribed from pPC12-1 or pNach2 (Cooperman et al., Proc. Nat'l Acad Sci. USA 84:8721 (1987)) according to the manufacturer's instructions for RNA labeling with digoxigenin-UTP (BOEHRINGER MANNHEIM). Unlabeled probes were synthesized by replacing digoxigenin-UTP with rUTP. Each section was covered with ≈100 μl of hybridization solution containing 20 mM TRIS HCl (pH 8.0), 2.5 mM EDTA, 50% formamide, 0.3 M NaCl, 1× Denhardt's, 10% dextran sulfate, 1 mg/ml tRNA, and probe at a concentration of 0.7 μg/ml. Sections were then covered with PARAFILM coverslips and incubated in a humid chamber overnight at 45° C. After hybridization, sections were washed in 50% formamide, 2× SSC at 45° C. for 1 hr., followed by RNase digestion in 0.5M NaCl, 10 mM TRIS HCl (pH 8.0), and 20 μg/ml RNase A (BOEHRINGER MANNHEIM). Sections were subsequently washed at 45° C. in 50% formamide, 2× SSC for 1 hr., and 50% formamide, 1× SSC for 1 hr.

Immunological detection was performed using a kit (GENIUS 3 KIT, BOEHRINGER MANNHEIM), according to the manufacturer's instructions. In most experiments, the sections were incubated in the color solution for ≈3-5 hrs. at room temp. Sections were then coverslipped with AQUA-MOUNT (Lerner Laboratories) and stored in the dark.

Densitometry. Levels of sodium channel mRNA were determined by densitometric analysis of the autoradiograms using Bio Image software (Millipore Corp., Ann Arbor, Mich.). Levels of RNA were normalized to the quantitated levels of cyclophilin mRNA.
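As a non-limiting illustration, the normalization described above can be sketched as follows. The band intensities are hypothetical placeholder values, not measured data, and the function names `normalize` and `percent_of_max` are introduced here for illustration only. Each lane's sodium channel signal is divided by the cyclophilin signal from the same lane, and the normalized values can then be scaled to percent of maximum, as is done for the developmental series plotted in FIG. 6B.

```python
def normalize(target: dict, control: dict) -> dict:
    """Divide each lane's target band intensity by its loading-control intensity."""
    return {lane: target[lane] / control[lane] for lane in target}

def percent_of_max(values: dict) -> dict:
    """Rescale normalized values so that the largest equals 100."""
    peak = max(values.values())
    return {lane: 100.0 * v / peak for lane, v in values.items()}

# Hypothetical densitometry readings (arbitrary units), one entry per lane.
channel_signal = {"P7": 200.0, "P14": 300.0, "P42": 450.0}
cyclophilin_signal = {"P7": 100.0, "P14": 100.0, "P42": 90.0}

ratios = normalize(channel_signal, cyclophilin_signal)
print(ratios)
print(percent_of_max(ratios))
```

Dividing by the internal control before scaling means that lane-to-lane differences in RNA loading do not appear as apparent differences in transcript level.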

Results

Isolation of a cDNA expressed preferentially in peripheral nerve. D'Arcangelo et al., J. Cell Biol. 122:915-921 (1993) showed previously that NGF treatment of PC12 cells increases the level of an ≈11 kb sodium channel gene transcript which did not hybridize to probes specific for any of the known sodium channel genes. A transcript identical in size was also detected in mRNA from adult rat sympathetic and sensory ganglia, but not in mRNA from brain. These results suggested that the transcript encoded a new member of the sodium channel gene family (termed Peripheral Nerve type 1 (PN1)).

To confirm the identity of the PN1 gene, cDNAs from an NGF-treated PC12 subclone which preferentially expresses PN1 mRNA (PKI-4 cells; D'Arcangelo et al.) were amplified by the polymerase chain reaction (PCR), using a pair of degenerate oligonucleotide primers that specify a 400 base pair (bp) region of the sodium channel α subunit gene (see Methods, FIG. 1). Both primers specified putative membrane-spanning regions within repeat domain III, which are highly conserved among voltage-gated sodium channels. The amplified region between the primers includes the strictly-conserved pore-lining residues, as well as residues which are divergent among the different mammalian α subunits. Sequence analysis of the PCR products revealed a cDNA, pPC12-1, which encoded a portion of a novel putative sodium channel α subunit (FIG. 1). Additional cDNAs were further isolated which encompassed the entire PN1 coding region.

To determine whether pPC12-1 encodes part of the PN1 gene, the cDNA was used to generate antisense RNA probes for Northern blot analysis of mRNA from control and NGF-treated PC12 cells (FIG. 2B). For comparison, a duplicate blot (FIG. 2A) was hybridized with an antisense probe generated from pRB211, which encodes a highly-conserved region of the sodium channel α subunit (Cooperman et al., Proc. Nat'l Acad Sci. USA 84:8721 (1987)) and which cross-hybridizes with the PN1 transcript. A PN1-specific probe should detect a transcript of the same size as the PN1 transcript detected by pRB211 and, as shown by D'Arcangelo et al., J. Cell Biol. 122:915-921 (1993), levels of the detected transcript should increase rapidly and transiently following NGF treatment (maximal ≈5 hrs). Comparison of FIGS. 2A and 2B shows that pPC12-1 fulfilled both of these criteria. Also, consistent with D'Arcangelo et al., J. Cell Biol. 122:915-921 (1993), we found that NGF induction of the transcript detected by pPC12-1 is independent of cAMP-dependent protein kinase activity.

To isolate additional cDNAs encoding PN1, a random- and oligo (dT)-primed Lambda ZAP II cDNA library (STRATAGENE, 5.6×10⁶ independent clones) was prepared from poly(A)+ mRNA isolated from the same PC12 subclone from which pPC12-1 was isolated. Screening 4×10⁴ recombinants with a probe generated from pPC12-1 resulted in isolation of 2 additional, overlapping cDNAs which were joined to give a 3033 bp cDNA (FIG. 7). Additional cDNAs were further isolated which encompassed the entire PN1 coding region.

Analysis of the deduced primary structure of PN1. As shown in FIG. 8, the deduced primary structure of PN1 includes repeat domain III of the sodium channel α subunit. Comparison with the type II sodium channel shows that the PN1 sequence contains all of the structural motifs characteristic of voltage-gated sodium channels, including six putative transmembrane domains (IIIS1-IIIS6). The S4 domain, thought to serve as the voltage sensor, exhibits the highly-conserved pattern of a positively-charged residue (lysine or arginine) at every third position. Furthermore, the putative pore-lining segments (IIISS1-IIISS2) contain residues shown to be involved in sodium-selective permeation (Heinemann et al., Nature 356:441-443 (1992)) as well as TTX affinity (Terlau et al., FEBS Lett. 293:93-96 (1991)).

In addition to such highly-conserved structural features, the sodium channel α subunit undergoes several characteristic post-translational modifications. All sodium channels sequenced to date exhibit a distinctive pattern of asparagine-linked (N-linked) glycosylation sites, which are found almost exclusively in the extracellular loops joining the S5 and S6 transmembrane helices. The N-linked glycosylation sites of PN1 are in good agreement with this pattern; three potential extracellular glycosylation sites are located between IIIS5 and IIIS6. Two of the sites are also found in the types I, II and III sodium channels.
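As a non-limiting illustration, potential N-linked glycosylation sites can be located in a deduced amino acid sequence by searching for the consensus sequon Asn-X-Ser/Thr, where X is any residue other than proline. The sketch below uses this general consensus, which is not recited in the text above, and a hypothetical test peptide rather than the PN1 sequence; the function name `n_glyc_sites` is likewise introduced here for illustration only. A sequon match indicates only a potential site, since actual glycosylation also depends on the extracellular location of the residue.

```python
import re

# Lookahead so that overlapping sequons (e.g. "NNST") are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")

def n_glyc_sites(protein: str) -> list:
    """Return 1-based positions of Asn residues in N-X-S/T sequons (X != Pro)."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]

# Hypothetical peptide: N2-L-S and N12-G-T match; N7-P-T is excluded by the proline.
peptide = "MNLSAANPTGGNGTQ"
print(n_glyc_sites(peptide))
```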

The α subunit is phosphorylated by protein kinase C (PKC), and the deduced PN1 sequence contains the highly-conserved consensus PKC phosphorylation site (FIGS. 1A-B). This residue is located in the cytoplasmic loop joining domains III and IV that has been implicated in channel inactivation, and mutational analysis has shown that this serine is required for PKC modulation of channel inactivation (West et al., 1991).

The entire DNA (FIGS. 9A-C) and amino acid (FIG. 10) sequences were determined. The rat PN1 amino acid sequence was compared with new human sequences (FIGS. 11A-F) presented in Example 2.

In sum, the deduced primary structure of PN1 contains all of the hallmark structural and functional domains characteristic of an α subunit of the voltage-gated sodium channel.

The PN1 gene is expressed preferentially in the PNS. To determine whether the PN1 gene was expressed preferentially in the PNS, total RNA was isolated from adult rat brain, spinal cord, SCG, DRG, skeletal muscle, and cardiac muscle and subjected to Northern blot analysis. Blots were hybridized with the PN1-specific antisense probe generated from pPC12-1. As shown in FIG. 3A, we found high levels of hybridization to an ≈11 kb transcript in both SCG and DRG. Much lower, but detectable, levels of hybridization were seen to transcripts in both spinal cord and brain. No detectable hybridization was observed to mRNA from skeletal muscle, cardiac muscle, or liver.

Ribonuclease (RNase) protection analyses were also performed. Total RNA was isolated from the same tissues used in Northern blot analysis, as well as adrenal gland, and hybridized to the PN1-specific antisense probe (pPC12-1). mRNA from SCG, DRG, brain, spinal cord, and adrenal gland protected a 343 bp fragment of the PN1 probe (FIG. 4B). The non-protected bases represent oligonucleotide primer and plasmid sequences. The PN1 probe was not protected by mRNA from either skeletal muscle or cardiac muscle.

To determine the relative amounts of PN1 mRNA in the various tissues, autoradiographs from three separate RNase protection experiments were analyzed by densitometry. To control for small differences in the amount of total RNA between samples, we included a probe for β actin. PN1 mRNA levels in both SCG and DRG are approximately 40-fold greater than in spinal cord, adrenal gland and brain.

The PN1 gene is expressed in sympathetic and sensory neurons. To determine whether the PN1 gene is expressed in neurons of peripheral ganglia, in situ hybridization was used to examine the cellular distribution of PN1 mRNA in adult rat SCG and DRG. Cryostat sections were hybridized with a PN1-specific digoxigenin-labeled RNA probe (pPC12-1), which was visualized using an anti-digoxigenin antibody conjugated to alkaline phosphatase. As shown in FIGS. 4A and 4B, the PN1 antisense probe labeled most neuronal cell bodies in both SCG and DRG. To confirm that the hybridization signal was due to binding of the probe specifically to PN1 mRNA, we performed two different negative controls: (1) Sections were hybridized with the digoxigenin-labeled probe in the presence of a 100-fold excess of unlabeled PN1 antisense probe. (2) Previous experiments have shown that SCG and DRG contain extremely low levels of type II sodium channel mRNA (Beckh, S., FEBS Lett. 262:317-322 (1990)). Therefore, we also hybridized sections with a type II-specific antisense probe. As shown in FIGS. 4C-F, both of these control experiments greatly reduced the hybridization signal. Also, consistent with the results of Northern blot and RNase protection analyses, we found that hybridization of the labeled PN1 probe to sections of adult rat cerebral cortex yielded no detectable staining.

Although the PN1 probe stained most neuronal cell bodies in both SCG and DRG, we found that cell-to-cell variability in PN1 mRNA levels differed between the two ganglia. SCG neurons were fairly homogeneous, in that the intensity of reaction product was relatively constant between different cells. DRG neurons, however, were quite heterogeneous in that the staining intensity varied considerably from cell to cell. For example, in FIG. 4B, arrows indicate two DRG neurons of approximately the same diameter which differ markedly in staining intensity.

Finally, we found that the PN1 probe did not stain non-neuronal cells such as satellite cells and Schwann cells. However, it is possible that these cells contain very low levels of PN1 mRNA which are not detectable by this method.

SCG neurons also express the type I sodium channel gene. Earlier Northern blot analysis has shown that mRNA from SCG contains two distinct sodium channel gene transcripts. As we have demonstrated, the larger, 11 kb transcript encodes the PN1 sodium channel. The smaller transcript, however, has not yet been identified. We hypothesized that this smaller transcript encoded the type I sodium channel, because moderate levels of type I mRNA have been found in other PNS tissues (Beckh, S., FEBS Lett. 262:317-322 (1990)). To test this hypothesis, Northern blots of SCG mRNA isolated from adult rats were hybridized with an antisense probe specific for the type I sodium channel gene (pNach1, see Methods above). As shown in FIG. 5, the type I-specific probe hybridized specifically to the smaller transcript. Furthermore, we have found that SCG mRNA protects the type I probe in an RNase protection assay.

The putative PN1 α subunit and type I α subunit genes are differentially regulated during development. Several studies have shown that the types I, II and III sodium channel genes are differentially regulated during development in both the central and peripheral nervous systems. To determine whether the PN1 and type I genes are also independently regulated during development, we measured their relative mRNA levels in SCG isolated from rats of different postnatal ages. To visualize both transcripts simultaneously, Northern blots were hybridized with the conserved sodium channel gene probe pRB211. As shown in FIG. 6A, in SCG removed on postnatal day 7 (P7), the levels of PN1 and type I mRNA are approximately equal. However, by P14, their relative abundance has shifted such that the level of PN1 mRNA exceeds that of type I by ≈*-fold. This increase in the ratio of PN1 to type I mRNA levels continues for at least the next four postnatal weeks. By P42, PN1 is the predominant sodium channel gene transcript, with levels of PN1 mRNA several-fold greater than that of type I.

To quantitate the developmental changes in mRNA levels, autoradiographs from three separate experiments were analyzed by densitometry. To control for differences in the amount of total RNA between lanes, blots were subsequently hybridized with a probe for the internal control cyclophilin. As shown in FIG. 6B, in which percent maximum mRNA is plotted versus postnatal age, the shift in relative abundance of the two transcripts is largely due to a developmental decrease in the level of type I sodium channel mRNA. From P7 to P42, the level of type I mRNA decreases by approximately 80%.

EXAMPLE 2 Drug Screening for PN-1 Antagonists

The ability of a PNS SCP-ligand (e.g., antagonists and agonists) to inhibit or enhance the activity of a PNS SCP is evaluated with cells expressing at least one PNS SCP. An assay for PNS SCP activity in such cells is used to determine the functionality of the PNS SCP protein in the presence of at least one agent which can act as antagonist or agonist, and thus, agents that interfere with or enhance the activity of PNS SCP are identified. Two or more cell lines (each expressing a different PNS SCP) are used, as well as optionally using one or more cell lines expressing a CNS specific sodium channel as a control.

These agents are selected and screened (1) at random; (2) by a rational selection; and/or (3) by design using, for example, computer modeling techniques.

There are numerous variations of assays which can be used by a skilled artisan, without the need for undue experimentation, in order to isolate modulating agents or ligands of a PNS SCP. Agent determination methods include Computer Assisted Molecular Design (CAMD), PNS SCP-agent binding, sophisticated chemical synthesis and testing, targeted screening, peptide combinatorial library technology, antisense technology and/or biological assays, according to known methods. See, e.g., Rapaka et al., eds., Medications Development: Drug Discovery, Databases, and Computer-Aided Drug Design, NIDA Research Monograph 134, NIH Publication No. 93-3638, U.S. Dept. of Health and Human Services, Rockville, Md. (1993); Langone, Methods in Enzymology, Volume 203, Molecular Design and Modeling: Concepts and Applications, Part B, Antibodies and Antigens, Nucleic Acids, Polysaccharides and Drugs, Section III, pp 587-702, Academic Press, New York (1991).

Alternatively, cell expression libraries, or other cells that have been selected or genetically engineered to express and display a PNS SCP via the use of the PNS SCP nucleic acids of the invention, are preferred in such methods, as host cell lines may be chosen which are devoid of related receptors. Rapaka, infra, (1993), at pages 58-65.

A PNS SCP agent in the context of the present invention refers to any chemical or biological molecule that associates with a PNS SCP in vitro, in situ or in vivo, and can be, but is not limited to, synthetic, recombinant or naturally derived chemical compounds and compositions, e.g., organic compounds, nucleic acids, peptides, carbohydrates, vitamin derivatives, hormones, neurotransmitters, viruses or receptor binding domains thereof, opsins, rhodopsins, nucleosides, nucleotides, coagulation cascade factors, odorants or pheromones, toxins, growth factors, platelet activating factors, neuroactive peptides, neurohumors, or any biologically active compound, such as drugs or naturally occurring compounds.

The agents are selected and screened at random or rationally selected or designed using computer modeling techniques. For random screening, potential agents are selected and assayed for their ability to bind to the PNS SCP, or a fragment thereof. Alternatively, agents may be rationally selected or designed. As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen based on the configuration of at least one specific PNS SCP (e.g., as presented in FIG. 11). For example, one skilled in the art can readily adapt currently available procedures to generate agents capable of binding to a specific peptide sequence in order to generate rationally designed compounds, such as chemical compounds, nucleic acids or peptides. See, e.g., Rapaka, infra, (1993); Hruby et al., "Application of Synthetic Peptides: Antisense Peptides," in Synthetic Peptides: A User's Guide, W.H. Freeman, New York (1992), pp. 289-307; and Kasprzak et al., Biochemistry 28:9230-9238 (1989).

A method of screening for an agent that modulates the activity of at least one PNS SCP comprising:

(a) incubating at least one cell line expressing at least one PNS SCP with an agent to be tested; and

(b) assaying the at least one cell line for the activity of the at least one PNS SCP protein by measuring the agent's effect on PNS SCP binding or PNS SCP activity; preferably, the assay distinguishes the agent's effect on alternative PNS SCPs and determines that the agent has little or no effect on CNS sodium channels, or has relatively less effect on CNS sodium channels.

Any cell can be used in the above assay so long as it expresses a functional form of PNS SCP protein and the PNS SCP activity can be measured. The preferred expression cells are eukaryotic cells or organisms. Such cells can be modified to contain DNA sequences encoding the PNS SCP protein using routine procedures known in the art. Alternatively, one skilled in the art can introduce mRNA encoding the PNS SCP protein directly into the cell.

In an alternative embodiment, stem cell populations for either neuronal or glial cells can be genetically engineered to express a functional PNS SCP ion channel. Such cells expressing the PNS SCP ion channel can be transplanted to the diseased or injured region of the mammal's neurological system (Neural Transplantation: A Practical Approach, Dunnett & Björklund, eds., Oxford University Press, New York, N.Y. (1992)). In another embodiment, embryonic tissue or fetal neurons can be genetically engineered to express a functional PNS SCP ion channel and transplanted to the diseased or injured region of the mammal's limbic system. The feasibility of transplanting fetal dopamine neurons into Parkinsonian patients has been demonstrated (Lindvall et al., Archives of Neurology 46:615-631 (1989)).

At least two types of approaches are currently used to express voltage-dependent sodium channel clones in order to generate functional channel proteins. In one approach, mRNA encoding the cloned cDNA is expressed in Xenopus oocytes. The sodium channel cDNA is cloned into a bacterial expression vector such as the pGEM recombinant plasmid (Melton et al., 1984). Transcription of the cloned cDNA is carried out using an RNA polymerase such as SP6 polymerase or T7 polymerase with a capping analog such as m⁷G(5')ppp(5')G. The resulting RNA (e.g., about 50 nl, corresponding to 2-5 ng) is injected into stage V and stage VI oocytes isolated from Xenopus, and incubated for 3-5 days at 19° C. Oocytes are tested for sodium channel expression with a two-microelectrode voltage clamp (Trimmer et al., Neuron 3:33-49 (1989)).

In an alternative approach, a cDNA encoding a voltage-dependent sodium channel is cloned into any one of a number of mammalian expression vectors, and transfected into mammalian cells which do not express endogenous voltage-dependent sodium channels (such as fibroblast cell lines). Transfected clones expressing the cloned cDNA are selected. Sodium channel expression is measured with a whole cell voltage clamp technique using a patch electrode (D'Arcangelo et al., J. Cell. Biol. 122:915-921 (1993)).

Sources of PNS SCPs and Cell Lines Useful for Drug Screening. Any cell line expressing a PNS SCP (naturally, by induction or due to recombinant expression) can be used for drug screening. As a non-limiting example, PC12 cells express both PN1 and Type II sodium channels. A126-1B2 cells are mutants deficient in Protein Kinase A (PKA) activity which express PN1, but are now discovered not to express Type II sodium channels. PKI-4 is a PC12 cell line transfected with a cDNA encoding a peptide inhibitor of PKA. Each of these cell lines can be used as one source of a PNS SCP of the present invention, or as a cell line itself to use in drug screening. Treatment of PC12 cells with NGF induces both a PNS SCP (PN1) and type II sodium channels, while NGF induces only PN1 in A126-1B2 cells. PKI-4 cells express a PNS SCP (PN1) without NGF treatment (D'Arcangelo et al., J. Cell Biol. 122:915-921 (1993)).

Additionally or alternatively, heterologous expression systems can also be used in which cell lines (such as Chinese Hamster Ovary cells (CHO)) are stably transfected with a cDNA encoding PN-1. Method steps for transfecting and stably expressing cDNA to form heterologous cell lines, are well known in the art. An advantage of using transfected cells is that clones are obtained that express very high levels of a PNS SCP, such as PN-1.

To screen for PNS SCP modulators, as antagonists or agonists, drugs are examined for their ability to:

(a) inhibit or enhance the binding of radioligands to a PNS SCP (labeled ligand binding reaction), and/or

(b) to inhibit or enhance ion flux through the channel of the PNS SCP in a cell line that expresses a PNS SCP.

Labeled ligand binding. Neurotoxins can be used to characterize PNS sodium channels. For example, previous studies have identified at least six distinct neurotoxin binding sites on previously characterized non-PNS sodium channels (reviewed in Lombet et al., FEBS Lett. 219(2):355-359 (1987)). Many of these sites are thought to be allosterically coupled to one another (for review, see Strichartz et al., Ann. Rev. Neurosci. 10:237-267 (1987), and references cited therein). In other words, binding of a drug or toxin to a particular neurotoxin site can be sensitive to drug binding at not only that site, but other sites on the channel as well. This is advantageous for a drug screening program in that, for a given labeled ligand, the likelihood of identifying agents that preferentially bind to a PNS SCP is increased.

The techniques described herein for measuring labeled ligand binding to a PNS SCP of the invention in intact cells (e.g., PC12 PKI or PNS SCP expressing heterologous cell lines) in suspension are similar to those described previously for radioligand binding to other sodium channels in brain synaptosomal preparations (see, e.g., Catterall et al., J. Biol. Chem. 256(17):8922-8927 (1981)). However, it is well recognized by those skilled in the art that these techniques are routinely modified for the use of substrate-attached cells or broken cell preparations, based on the teaching and guidance presented herein.

A126-1B2, PC12, PKI-4 or other cells expressing a PNS SCP are grown using standard techniques, and optionally treated with NGF for 1-2 days to induce PN-1 expression. Cells are harvested and tested for ion flux activity with alternative potential agents.

For both radioligands, binding reactions are conducted, e.g., at 37° C., then stopped. Samples are quickly filtered under vacuum and washed with ice-cold buffer, and bound radioactivity is determined by scintillation counting.

An ion flux assay directly tests the ability of potential PNS SCP agents to inhibit or enhance PNS SCP function, by measuring their ability to inhibit or enhance the influx of ion tracers through a PNS SCP.

Most previous sodium channel studies have employed ²²Na as a tracer (for example, see Catterall et al., J. Biol. Chem. 256(17):8922-8927 (1981)). However, the high toxicity of ²²Na can be a disadvantage for its use in high-throughput drug screening. A less toxic alternative is (¹⁴C) guanidinium ion, influx of which has been shown to be a reliable indicator of sodium channel opening (Reith, Europ. J. Pharmacol. 188:33-41 (1990)). Accordingly, routine methods can be used to screen compounds for modulating PNS SCP ion channel activity, e.g., (¹⁴C) guanidinium ion flux using intact cells expressing at least one PNS SCP. Additionally, these methods are well known to be easily modified for use with ²²Na. Similarly, these known method steps could be modified for use with substrate-attached cells or vesicles prepared from broken cells, according to known method steps.

For a guanidinium flux assay, the methods for ²²Na are modified from those of Reith (Europ. J. Pharmacol. 188:33-41 (1990) for brain synaptosomes), e.g., as described in Example 2 below. Aliquots of a cell suspension containing heterologous cells expressing at least one PNS SCP are incubated for 10 minutes at 37° C. in the presence of channel openers (typically, 100 μM veratridine) and test drugs in a total volume of 100 μl (0.20-0.25 mg protein). Ion flux is initiated by the addition of HEPES/TRIS solution also containing 4 mM guanidine HCl (final) and 1000 dpm/nmol (¹⁴C) guanidine. The reaction is continued for 30 seconds and is stopped by the addition of ice-cold incubation buffer, followed by rapid filtration under vacuum over Whatman GF/C filters. The filters are washed rapidly with ice-cold incubation buffer and radioactivity is determined by scintillation counting. Nonspecific uptake is determined in parallel by the inclusion of 1 mM tetrodotoxin during both preincubation and uptake.
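As a non-limiting illustration, filter counts from such an assay can be converted to channel-specific guanidinium uptake as sketched below. The specific activity (1000 dpm/nmol) and the protein amount (0.25 mg, within the stated 0.20-0.25 mg range) are taken from the description above; the dpm values are hypothetical placeholders, and the function names are introduced here for illustration only. Specific uptake is taken as total uptake minus the uptake remaining in the presence of tetrodotoxin.

```python
SPECIFIC_ACTIVITY_DPM_PER_NMOL = 1000.0  # (14C) guanidine, as stated in the text

def specific_uptake(total_dpm: float, nonspecific_dpm: float, protein_mg: float) -> float:
    """Tetrodotoxin-sensitive guanidinium uptake in nmol per mg protein."""
    nmol = (total_dpm - nonspecific_dpm) / SPECIFIC_ACTIVITY_DPM_PER_NMOL
    return nmol / protein_mg

def percent_inhibition(control_uptake: float, treated_uptake: float) -> float:
    """Inhibition of specific uptake by a test drug, relative to the no-drug control."""
    return 100.0 * (1.0 - treated_uptake / control_uptake)

# Hypothetical scintillation counts (dpm) for one 100 ul assay containing 0.25 mg protein.
control = specific_uptake(total_dpm=5500.0, nonspecific_dpm=500.0, protein_mg=0.25)
treated = specific_uptake(total_dpm=3000.0, nonspecific_dpm=500.0, protein_mg=0.25)

print(control, treated, percent_inhibition(control, treated))
```

Subtracting the tetrodotoxin-insensitive counts before computing inhibition keeps background filter binding from diluting the apparent effect of a test drug.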

Using the guanidinium flux assay, several methyl/halophenyl substituted compounds, such as lidoflazine (see, e.g., Merck Index Monograph 5311 and U.S. Pat. No. 3,267,104, both entirely incorporated herein by reference), were tested and found to inhibit sodium channel activity of at least one PNS SCP of the present invention in cell lines expressing at least one PNS SCP, with a pIC50 of 6.51 for lidoflazine on PKI-4 cells. Accordingly, the present invention provides PNS SCP modulating agents as methyl/halophenyl-substituted piperazines.
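As a non-limiting illustration, the reported pIC50 can be converted to a concentration using the definition pIC50 = -log10(IC50 in mol/l). The sketch below also evaluates the fraction of flux remaining at a given drug concentration; that second function assumes a simple one-site inhibition curve with a Hill slope of 1, which is an assumption made here and is not stated in the text. Both function names are hypothetical.

```python
def pic50_to_molar(pic50: float) -> float:
    """IC50 in mol/l from pIC50 = -log10(IC50)."""
    return 10.0 ** (-pic50)

def fraction_remaining(conc_molar: float, ic50_molar: float, hill: float = 1.0) -> float:
    """Fraction of uninhibited flux at a drug concentration.
    Assumes a one-site model; hill=1.0 is an illustrative default."""
    return 1.0 / (1.0 + (conc_molar / ic50_molar) ** hill)

ic50 = pic50_to_molar(6.51)  # value reported for lidoflazine on PKI-4 cells
print(f"IC50 = {ic50 * 1e6:.2f} uM")
print(f"flux remaining at 1 uM = {fraction_remaining(1e-6, ic50):.2f}")
```

A pIC50 of 6.51 therefore corresponds to an IC50 of roughly 0.31 μM.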

EXAMPLE 3 Identification of Human PNS SCP Sequence from a Human Peripheral Nervous System cDNA Library

Similar to the procedures provided in Example 1, a human peripheral nervous system cDNA library (as a human DRG library) was used for polymerase chain reaction (PCR) amplification. The PCR used a 5' primer corresponding to DNA encoding amino acids 604-611 of SEQ ID NO:2, and a corresponding 3' primer encoding amino acids 723-731 of SEQ ID NO:2.

The PCR reaction mixture consisted of 5% of the cDNA, 1 mM MgCl₂, 0.2 mM dNTPs, 0.5 μM each primer, and Taq polymerase (Perkin-Elmer) in a buffer consisting of 0.1 M KCl, 0.1 M TRIS HCl (pH 8.3) and gelatin (1 mg/ml). The reaction was performed in a Perkin-Elmer thermocycler as follows: five cycles of denaturation (94° C., 1 min.), annealing (37° C., 1 min.), and extension (72° C., 1 min.), followed by 25 cycles of denaturation (94° C., 1 min.), annealing (50° C., 1 min.), and extension (72° C., 1 min.).
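As a non-limiting illustration, the two-stage cycling program described above can be written as a small data structure, from which the total number of cycles and the programmed hold time follow directly. The class and function names are hypothetical, and the hold time excludes temperature ramping between steps, which depends on the instrument.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Stage:
    cycles: int
    steps: tuple  # each step is (label, temperature in deg C, minutes)

# Five low-stringency cycles (37 C annealing), then 25 cycles at 50 C annealing.
PROGRAM = (
    Stage(5, (("denature", 94, 1), ("anneal", 37, 1), ("extend", 72, 1))),
    Stage(25, (("denature", 94, 1), ("anneal", 50, 1), ("extend", 72, 1))),
)

def total_cycles(program) -> int:
    return sum(stage.cycles for stage in program)

def hold_minutes(program) -> int:
    """Sum of programmed hold times; ramp times are not included."""
    return sum(stage.cycles * sum(m for _, _, m in stage.steps) for stage in program)

print(total_cycles(PROGRAM), "cycles,", hold_minutes(PROGRAM), "min of programmed holds")
```

The initial low-temperature annealing cycles allow the degenerate primers to bind imperfectly matched template; the later cycles at 50° C. then favor amplification of products that already carry the full primer sequences.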

The resulting PCR products provided a human amplified cDNA which encoded amino acids 646-658 of SEQ ID NO:2, as presented in FIGS. 11A-F.

EXAMPLE 4 Cloning and Sequencing of Human PN-1 Sequence from Human Dorsal Root Ganglion cDNA Library

As in Examples 1 and 3 above, additional PCR primers corresponding to SEQ ID NO:1 are used to isolate clones from the human DRG cDNA library which encompass the entire coding region of one or more human PNS SCPs of the present invention. A 5' primer includes the sequence 5'TTTGTGCCCCACAGACCCCAG3' (SEQ ID NO:17) and a 3' primer includes the sequence 5' ACACAAATTCTTGATCTGGAATTGCT3' (SEQ ID NO:18) or 5'CAACCTC AGACAGAGAG CAATGA 3' (SEQ ID NO:19), which are used for nested PCR. According to Examples 1 and 3 above, PCR is performed to obtain cDNAs encoding a human PNS SCP.

Additional PCR is performed by "walking" 5' or 3' of the sequence corresponding to the above PCR product. In this way cDNAs encompassing the entire coding region of one or more human PNS SCPs are provided.

The resulting additional cDNA clones or PCR products, encoding the entire human PNS SCP, are subcloned into a plasmid vector previously restricted with suitable restriction sites. The clones are screened for cDNA inserts by miniprep (Sambrook et al., infra) and sequenced in both directions by dideoxy chain termination (Sequenase 2.0 kit, United States Biochemical). Sequence data is compiled and analyzed using GeneWorks software (IntelliGenetics, Inc., Mountain View, Calif.). The expected alternative amino acid sequences for a human PN1 sequence are presented in FIGS. 11A-F and as SEQ ID NOS:7, 11 and 12, where Xaa represents 0, 1, 2 or 3 amino acids.

Transcripts of the size of the resulting human PNS SCP are then confirmed to be present in human PNS mRNA or cDNA (encoding a 1970-1990 amino acid sequence of FIGS. 11A-F). However, as in Example 1, such transcripts are not expected to be detected in mRNA from brain. This expected result confirms new human members of the sodium channel gene family (termed Human Peripheral Nerve type 1, HUMPN1A and HUMPN1B of FIGS. 11A-F, where X is 0, 1, 2 or 3 of the same or different amino acid).

Complete DNA and amino acid sequences of novel human PN1s are then confirmed and are expected to contain all of the structural and functional domain characteristics of an α subunit of a mammalian voltage-gated sodium channel.

All references cited herein, including journal articles or abstracts, published or corresponding U.S. or foreign patent applications, issued U.S. or foreign patents, or any other references, are entirely incorporated by reference herein, including all data, tables, figures, and text presented in the cited references. The foregoing description of the specific embodiments will so fully reveal the general nature of the invention that others can, by applying knowledge within the skill of the art (including the contents of the references cited herein), readily modify and/or adapt for various applications such specific embodiments, without undue experimentation, without departing from the general concept of the invention. Therefore, such adaptations and modifications are intended to be within the meaning and range of equivalents of the disclosed embodiments, based on the teaching and guidance presented herein. It is to be understood that the phraseology or terminology herein is for the purpose of description and not of limitation, such that the terminology or phraseology of the present specification is to be interpreted by the skilled artisan in light of the teachings and guidance presented herein, in combination with the knowledge of one of ordinary skill in the art.

    __________________________________________________________________________     #             SEQUENCE LISTING                                                 - (1) GENERAL INFORMATION:                                                     -    (iii) NUMBER OF SEQUENCES: 19                                             - (2) INFORMATION FOR SEQ ID NO:1:                                             -      (i) SEQUENCE CHARACTERISTICS:                                           #pairs    (A) LENGTH: 3033 base                                                          (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: both                                                         (D) TOPOLOGY: both                                                   -     (ii) MOLECULE TYPE: DNA (genomic)                                        -     (ix) FEATURE:                                                                      (A) NAME/KEY: CDS                                                              (B) LOCATION: 1..3033                                                -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:                                  - AGG AAC CTT GTG GTC CTG AAC CTG TTT CTG GC - #T CTT TTG CTG AGT TCC            48                                                                           Arg Asn Leu Val Val Leu Asn Leu Phe Leu Al - #a Leu Leu Leu Ser Ser            #                 15                                                           - TTT AGT TCT GAC AAT CTT ACA GCA ATT GAG GA - #A GAC ACC GAT GCA AAC            96                                                                           Phe Ser Ser Asp Asn Leu Thr Ala Ile Glu Gl - #u Asp Thr Asp Ala Asn            #             30                                                               - AAC CTC CAG ATC GCA GTG GCC AGA ATT AAG AG - #G GGA ATC AAT TAC GTG           144                                                                           Asn Leu Gln Ile Ala 
Val Ala Arg Ile Lys Ar - #g Gly Ile Asn Tyr Val            #         45                                                                   - AAA CAG ACC CTG CGT GAA TTC ATT CTA AAA TC - #A TTT TCC AAA AAG CCA           192                                                                           Lys Gln Thr Leu Arg Glu Phe Ile Leu Lys Se - #r Phe Ser Lys Lys Pro            #     60                                                                       - AAG GGC TCC AAG GAC ACA AAA CGA ACA GCA GA - #T CCC AAC AAC AAG AAA           240                                                                           Lys Gly Ser Lys Asp Thr Lys Arg Thr Ala As - #p Pro Asn Asn Lys Lys            # 80                                                                           - GAA AAC TAT ATT TCA AAC CGT ACC CTT GCG GA - #G ATG AGC AAG GAT CAC           288                                                                           Glu Asn Tyr Ile Ser Asn Arg Thr Leu Ala Gl - #u Met Ser Lys Asp His            #                 95                                                           - AAT TTC CTC AAA GAA AAG GAT AGG ATC AGT GG - #T TAT GGC AGC AGT CTA           336                                                                           Asn Phe Leu Lys Glu Lys Asp Arg Ile Ser Gl - #y Tyr Gly Ser Ser Leu            #           110                                                                - GAC AAA AGC TTT ATG GAT GAA AAT GAT TAC CA - #G TCC TTT ATC CAT AAC           384                                                                           Asp Lys Ser Phe Met Asp Glu Asn Asp Tyr Gl - #n Ser Phe Ile His Asn            #       125                                                                    - CCC AGC CTC ACA GTG ACA GTG CCA ATT GCA CC - #T GGG GAG TCT GAT TTG           432                                                                           Pro Ser Leu Thr Val Thr Val Pro Ile Ala Pr - #o Gly Glu Ser Asp Leu            #   140                                      
                                  - GAG ATT ATG AAC ACA GAA GAG CTT AGC AGT GA - #C TCA GAC AGT GAC TAC           480                                                                           Glu Ile Met Asn Thr Glu Glu Leu Ser Ser As - #p Ser Asp Ser Asp Tyr            145                 1 - #50                 1 - #55                 1 -        #60                                                                            - AGC AAA GAG AAA CGG AAC CGA TCA AGC TCT TC - #T GAG TGC AGC ACT GTT           528                                                                           Ser Lys Glu Lys Arg Asn Arg Ser Ser Ser Se - #r Glu Cys Ser Thr Val            #               175                                                            - GAC AAC CCT CTG CCA GGA GAA GAG GAG GCT GA - #A GCA GAG CCC GTA AAC           576                                                                           Asp Asn Pro Leu Pro Gly Glu Glu Glu Ala Gl - #u Ala Glu Pro Val Asn            #           190                                                                - GCA GAT GAG CCT GAA GCC TGC TTT ACA GAT GG - #T TGT GTG AGG AGA TTT           624                                                                           Ala Asp Glu Pro Glu Ala Cys Phe Thr Asp Gl - #y Cys Val Arg Arg Phe            #       205                                                                    - CCA TGC TGC CAA GTT AAT GTA GAC TCT GGG AA - #A GGG AAA GTT TGG TGG           672                                                                           Pro Cys Cys Gln Val Asn Val Asp Ser Gly Ly - #s Gly Lys Val Trp Trp            #   220                                                                        - ACC ATC AGG AAG ACG TGC TAC AGG ATA GTT GA - #A CAC AGC TGG TTT GAA           720                                                                           Thr Ile Arg Lys Thr Cys Tyr Arg Ile Val Gl - #u His Ser Trp Phe Glu            225                 2 - #30                 2 - #35                 2 
-        #40                                                                            - AGC TTC ATC GTT CTC ATG ATC CTG CTC AGC AG - #T GGA GCT CTG GCT TTT           768                                                                           Ser Phe Ile Val Leu Met Ile Leu Leu Ser Se - #r Gly Ala Leu Ala Phe            #               255                                                            - GAA GAT ATC TAT ATT GAA AAG AAA AAG ACC AT - #T AAG ATT ATC CTG GAG           816                                                                           Glu Asp Ile Tyr Ile Glu Lys Lys Lys Thr Il - #e Lys Ile Ile Leu Glu            #           270                                                                - TAT GCT GAC AAG ATA TTC ACC TAC ATC TTC AT - #T CTG GAA ATG CTT CTA           864                                                                           Tyr Ala Asp Lys Ile Phe Thr Tyr Ile Phe Il - #e Leu Glu Met Leu Leu            #       285                                                                    - AAA TGG GTC GCA TAT GGG TAT AAA ACA TAT TT - #C ACT AAT GCC TGG TGT           912                                                                           Lys Trp Val Ala Tyr Gly Tyr Lys Thr Tyr Ph - #e Thr Asn Ala Trp Cys            #   300                                                                        - TGG CTG GAC TTC TTA ATT GTT GAT GTG TCT CT - #A GTT ACT TTA GTA GCC           960                                                                           Trp Leu Asp Phe Leu Ile Val Asp Val Ser Le - #u Val Thr Leu Val Ala            305                 3 - #10                 3 - #15                 3 -        #20                                                                            - AAC ACT CTT GGC TAC TCA GAC CTT GGC CCC AT - #T AAA TCT CTA CGG ACA          1008                                                                           Asn Thr Leu Gly Tyr Ser Asp Leu Gly Pro Il - #e Lys Ser Leu Arg Thr            #               
335                                                            - CTG AGG GCC CTA AGA CCC CTA AGA GCC TTG TC - #T AGA TTT GAA GGA ATG          1056                                                                           Leu Arg Ala Leu Arg Pro Leu Arg Ala Leu Se - #r Arg Phe Glu Gly Met            #           350                                                                - AGG GTA GTG GTC AAC GCA CTC ATA GGA GCA AT - #C CCT TCC ATC ATG AAC          1104                                                                           Arg Val Val Val Asn Ala Leu Ile Gly Ala Il - #e Pro Ser Ile Met Asn            #       365                                                                    - GTG CTT CTC GTG TGC CTT ATA TTC TGG CTA AT - #A TTT AGC ATC ATG GGA          1152                                                                           Val Leu Leu Val Cys Leu Ile Phe Trp Leu Il - #e Phe Ser Ile Met Gly            #   380                                                                        - GTC AAT CTG TTT GCT GGC AAG TTC TAT GAG TG - #T GTC AAC ACC ACC GAT          1200                                                                           Val Asn Leu Phe Ala Gly Lys Phe Tyr Glu Cy - #s Val Asn Thr Thr Asp            385                 3 - #90                 3 - #95                 4 -        #00                                                                            - GGG TCA CGA TTT CCT ACA TCT CAA GTT GCA AA - #C CGT TCT GAG TGT TTT          1248                                                                           Gly Ser Arg Phe Pro Thr Ser Gln Val Ala As - #n Arg Ser Glu Cys Phe            #               415                                                            - GCC CTG ATG AAC GTT AGT GGA AAT GTG CGA TG - #G AAA AAC CTG AAA GTA          1296                                                                           Ala Leu Met Asn Val Ser Gly Asn Val Arg Tr - #p Lys Asn Leu Lys Val            #           430                          
                                      - AAC TTC GAC AAC GTT GGG CTT GGT TAC CTG TC - #G CTG CTT CAA GTT GCA          1344                                                                           Asn Phe Asp Asn Val Gly Leu Gly Tyr Leu Se - #r Leu Leu Gln Val Ala            #       445                                                                    - ACA TTC AAG GGC TGG ATG GAT ATT ATG TAT GC - #A GCA GTT GAC TCT GTT          1392                                                                           Thr Phe Lys Gly Trp Met Asp Ile Met Tyr Al - #a Ala Val Asp Ser Val            #   460                                                                        - AAT GTA AAT GAA CAG CCG AAA TAC GAA TAC AG - #T CTC TAC ATG TAC ATT          1440                                                                           Asn Val Asn Glu Gln Pro Lys Tyr Glu Tyr Se - #r Leu Tyr Met Tyr Ile            465                 4 - #70                 4 - #75                 4 -        #80                                                                            - TAC TTT GTC ATC TTC ATC ATC TTC GGC TCA TT - #C TTC ACG TTG AAC CTG          1488                                                                           Tyr Phe Val Ile Phe Ile Ile Phe Gly Ser Ph - #e Phe Thr Leu Asn Leu            #               495                                                            - TTC ATT GGT GTC ATC ATA GAT AAT TTC AAC CA - #A CAG AAA AAA AAG CTT          1536                                                                           Phe Ile Gly Val Ile Ile Asp Asn Phe Asn Gl - #n Gln Lys Lys Lys Leu            #           510                                                                - GGA GGT CAA GAT ATC TTT ATG ACA GAA GAA CA - #G AAG AAA TAC TAT AAT          1584                                                                           Gly Gly Gln Asp Ile Phe Met Thr Glu Glu Gl - #n Lys Lys Tyr Tyr Asn            #       525                                                       
             - GCA ATG AAG AAG CTT GGG TCC AAA AAA CCA CA - #A AAA CCA ATT CCA AGG          1632                                                                           Ala Met Lys Lys Leu Gly Ser Lys Lys Pro Gl - #n Lys Pro Ile Pro Arg            #   540                                                                        - CCA GGG AAC AAA TTC CAA GGA TGT ATA TTT GA - #C TTA GTG ACA AAC CAA          1680                                                                           Pro Gly Asn Lys Phe Gln Gly Cys Ile Phe As - #p Leu Val Thr Asn Gln            545                 5 - #50                 5 - #55                 5 -        #60                                                                            - GCT TTT GAT ATC ACC ATC ATG GTT CTT ATA TG - #C CTC AAC ATG GTA ACC          1728                                                                           Ala Phe Asp Ile Thr Ile Met Val Leu Ile Cy - #s Leu Asn Met Val Thr            #               575                                                            - ATG ATG GTA GAA AAA GAG GGG CAA ACT GAG TA - #C ATG GAT TAT GTT TTA          1776                                                                           Met Met Val Glu Lys Glu Gly Gln Thr Glu Ty - #r Met Asp Tyr Val Leu            #           590                                                                - CAC TGG ATC AAC ATG GTC TTC ATT ATC CTG TT - #C ACT GGG GAG TGT GTG          1824                                                                           His Trp Ile Asn Met Val Phe Ile Ile Leu Ph - #e Thr Gly Glu Cys Val            #       605                                                                    - CTG AAG CTA ATC TCC CTC AGA CAT TAC TAC TT - #C ACT GTG GGT TGG AAC          1872                                                                           Leu Lys Leu Ile Ser Leu Arg His Tyr Tyr Ph - #e Thr Val Gly Trp Asn            #   620                                                                        - ATT TTG 
TAT TTT GTG GTA GTG ATC CTC TCC AT - #T GTA GGA ATG TTT CTC          1920                                                                           Ile Leu Tyr Phe Val Val Val Ile Leu Ser Il - #e Val Gly Met Phe Leu            625                 6 - #30                 6 - #35                 6 -        #40                                                                            - GCT GAG ATG ATA GAG AAG TAT TTC GTG TCC CC - #T ACC CTG TTC CGA GTC          1968                                                                           Ala Glu Met Ile Glu Lys Tyr Phe Val Ser Pr - #o Thr Leu Phe Arg Val            #               655                                                            - ATC CGC CTG GCC AGG ATT GGA CGA ATC CTA CG - #C CTG ATC AAA GGC GCC          2016                                                                           Ile Arg Leu Ala Arg Ile Gly Arg Ile Leu Ar - #g Leu Ile Lys Gly Ala            #           670                                                                - AAG GGG ATC CGC ACT CTG CTC TTT GCT TTG AT - #G ATG TCC CTT CCT GCG          2064                                                                           Lys Gly Ile Arg Thr Leu Leu Phe Ala Leu Me - #t Met Ser Leu Pro Ala            #       685                                                                    - CTG TTC AAC ATC GGC CTC CTG CTT TTC CTG GT - #C ATG TTC ATC TAC GCC          2112                                                                           Leu Phe Asn Ile Gly Leu Leu Leu Phe Leu Va - #l Met Phe Ile Tyr Ala            #   700                                                                        - ATC TTT GGG ATG TCC AAC TTT GCC TAC GTT AA - #A AAG GAG GCT GGA ATT          2160                                                                           Ile Phe Gly Met Ser Asn Phe Ala Tyr Val Ly - #s Lys Glu Ala Gly Ile            705                 7 - #10                 7 - #15                 7 -        #20                                
                                            - AAT GAC ATG TTC AAC TTT GAG ACT TTT GGC AA - #C AGC ATG ATC TGC TTG          2208                                                                           Asn Asp Met Phe Asn Phe Glu Thr Phe Gly As - #n Ser Met Ile Cys Leu            #               735                                                            - TTC CAA ATC ACC ACC TCT GCC GGC TGG GAC GG - #A CTG CTG GCC CCC ATC          2256                                                                           Phe Gln Ile Thr Thr Ser Ala Gly Trp Asp Gl - #y Leu Leu Ala Pro Ile            #           750                                                                - CTC AAC AGC GCA CCT CCC GAC TGT GAC CCT AA - #A AAA GTT CAC CCA GGA          2304                                                                           Leu Asn Ser Ala Pro Pro Asp Cys Asp Pro Ly - #s Lys Val His Pro Gly            #       765                                                                    - AGT TCA GTG GAA GGG GAC TGT GGG AAC CCA TC - #C GTG GGG ATT TTT TAC          2352                                                                           Ser Ser Val Glu Gly Asp Cys Gly Asn Pro Se - #r Val Gly Ile Phe Tyr            #   780                                                                        - TTT GTC AGC TAC ATC ATC ATA TCC TTC CTG GT - #G GTG GTG AAC ATG TAC          2400                                                                           Phe Val Ser Tyr Ile Ile Ile Ser Phe Leu Va - #l Val Val Asn Met Tyr            785                 7 - #90                 7 - #95                 8 -        #00                                                                            - ATC GCT GTC ATC CTG GAG AAC TTC AGC GTC GC - #C ACC GAA GAG AGC ACT          2448                                                                           Ile Ala Val Ile Leu Glu Asn Phe Ser Val Al - #a Thr Glu Glu Ser Thr            #               815                                         
                   - GAG CCT CTG AGT GAG GAC GAC TTT GAG ATG TT - #C TAC GAG GTC TGG GAG          2496                                                                           Glu Pro Leu Ser Glu Asp Asp Phe Glu Met Ph - #e Tyr Glu Val Trp Glu            #           830                                                                - AAG TTC GAC CCT GAC GCC ACT CAG TTC ATA GA - #G TTC TGC AAG CTC TCT          2544                                                                           Lys Phe Asp Pro Asp Ala Thr Gln Phe Ile Gl - #u Phe Cys Lys Leu Ser            #       845                                                                    - GAC TTT GCA GCT GCC CTG GAT CCT CCC CTC CT - #C ATC GCA AAG CCA AAC          2592                                                                           Asp Phe Ala Ala Ala Leu Asp Pro Pro Leu Le - #u Ile Ala Lys Pro Asn            #   860                                                                        - AAA GTC CAG CTC ATT GCC ATG GAC CTG CCC AT - #G GTG AGT GGA GAC CGC          2640                                                                           Lys Val Gln Leu Ile Ala Met Asp Leu Pro Me - #t Val Ser Gly Asp Arg            865                 8 - #70                 8 - #75                 8 -        #80                                                                            - ATC CAC TGC CTG GAC ATC TTG TTT GCT TTT AC - #A AAG CGG GTC CTG GGT          2688                                                                           Ile His Cys Leu Asp Ile Leu Phe Ala Phe Th - #r Lys Arg Val Leu Gly            #               895                                                            - GAG GGT GGA GAG ATG GAT TCT CTT CGT TCA CA - #G ATG GAA GAA AGG TTC          2736                                                                           Glu Gly Gly Glu Met Asp Ser Leu Arg Ser Gl - #n Met Glu Glu Arg Phe            #           910                                                                - ATG 
TCA GCC AAT CCT TCT AAA GTG TCC TAT GA - #A CCC ATC ACG ACC ACA          2784                                                                           Met Ser Ala Asn Pro Ser Lys Val Ser Tyr Gl - #u Pro Ile Thr Thr Thr            #       925                                                                    - CTG AAG AGA AAA CAA GAG GAG GTG TCC GCG AC - #T ATC ATT CAG CGT GCT          2832                                                                           Leu Lys Arg Lys Gln Glu Glu Val Ser Ala Th - #r Ile Ile Gln Arg Ala            #   940                                                                        - TAC AGA CGG TAT CGC CTC AGA CAA CAC GTC AA - #G AAT ATA TCG AGT ATA          2880                                                                           Tyr Arg Arg Tyr Arg Leu Arg Gln His Val Ly - #s Asn Ile Ser Ser Ile            945                 9 - #50                 9 - #55                 9 -        #60                                                                            - TAC ATA AAA GAT GGA GAC AGG GAT GAT GAT TT - #G CCC AAT AAA GAA GAT          2928                                                                           Tyr Ile Lys Asp Gly Asp Arg Asp Asp Asp Le - #u Pro Asn Lys Glu Asp            #               975                                                            - ACA GTT TTT GAT AAC GTG AAC GAG AAC TCA AG - #T CCG GAA AAG ACA GAT          2976                                                                           Thr Val Phe Asp Asn Val Asn Glu Asn Ser Se - #r Pro Glu Lys Thr Asp            #           990                                                                - GTA ACT GCC TCA ACC ATC TCG CCA CCT TCC TA - #T GAC AGT GTC ACA AAG          3024                                                                           Val Thr Ala Ser Thr Ile Ser Pro Pro Ser Ty - #r Asp Ser Val Thr Lys            #      10050                                                                   #       3033                   
                                                Pro Asp Gln                                                                        1010                                                                       - (2) INFORMATION FOR SEQ ID NO:2:                                             -      (i) SEQUENCE CHARACTERISTICS:                                           #acids    (A) LENGTH: 1011 amino                                                         (B) TYPE: amino acid                                                           (D) TOPOLOGY: linear                                                 -     (ii) MOLECULE TYPE: protein                                              -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:                                  - Arg Asn Leu Val Val Leu Asn Leu Phe Leu Al - #a Leu Leu Leu Ser Ser          #                 15                                                           - Phe Ser Ser Asp Asn Leu Thr Ala Ile Glu Gl - #u Asp Thr Asp Ala Asn          #             30                                                               - Asn Leu Gln Ile Ala Val Ala Arg Ile Lys Ar - #g Gly Ile Asn Tyr Val          #         45                                                                   - Lys Gln Thr Leu Arg Glu Phe Ile Leu Lys Se - #r Phe Ser Lys Lys Pro          #     60                                                                       - Lys Gly Ser Lys Asp Thr Lys Arg Thr Ala As - #p Pro Asn Asn Lys Lys          # 80                                                                           - Glu Asn Tyr Ile Ser Asn Arg Thr Leu Ala Gl - #u Met Ser Lys Asp His          #                 95                                                           - Asn Phe Leu Lys Glu Lys Asp Arg Ile Ser Gl - #y Tyr Gly Ser Ser Leu          #           110                                                                - Asp Lys Ser Phe Met Asp Glu Asn Asp Tyr Gl - #n Ser Phe Ile His Asn          #       125                                             
                       - Pro Ser Leu Thr Val Thr Val Pro Ile Ala Pr - #o Gly Glu Ser Asp Leu          #   140                                                                        - Glu Ile Met Asn Thr Glu Glu Leu Ser Ser As - #p Ser Asp Ser Asp Tyr          145                 1 - #50                 1 - #55                 1 -        #60                                                                            - Ser Lys Glu Lys Arg Asn Arg Ser Ser Ser Se - #r Glu Cys Ser Thr Val          #               175                                                            - Asp Asn Pro Leu Pro Gly Glu Glu Glu Ala Gl - #u Ala Glu Pro Val Asn          #           190                                                                - Ala Asp Glu Pro Glu Ala Cys Phe Thr Asp Gl - #y Cys Val Arg Arg Phe          #       205                                                                    - Pro Cys Cys Gln Val Asn Val Asp Ser Gly Ly - #s Gly Lys Val Trp Trp          #   220                                                                        - Thr Ile Arg Lys Thr Cys Tyr Arg Ile Val Gl - #u His Ser Trp Phe Glu          225                 2 - #30                 2 - #35                 2 -        #40                                                                            - Ser Phe Ile Val Leu Met Ile Leu Leu Ser Se - #r Gly Ala Leu Ala Phe          #               255                                                            - Glu Asp Ile Tyr Ile Glu Lys Lys Lys Thr Il - #e Lys Ile Ile Leu Glu          #           270                                                                - Tyr Ala Asp Lys Ile Phe Thr Tyr Ile Phe Il - #e Leu Glu Met Leu Leu          #       285                                                                    - Lys Trp Val Ala Tyr Gly Tyr Lys Thr Tyr Ph - #e Thr Asn Ala Trp Cys          #   300                                                                        - Trp Leu Asp Phe Leu Ile Val Asp Val Ser Le - #u Val Thr Leu Val Ala          
305                 3 - #10                 3 - #15                 3 -        #20                                                                            - Asn Thr Leu Gly Tyr Ser Asp Leu Gly Pro Il - #e Lys Ser Leu Arg Thr          #               335                                                            - Leu Arg Ala Leu Arg Pro Leu Arg Ala Leu Se - #r Arg Phe Glu Gly Met          #           350                                                                - Arg Val Val Val Asn Ala Leu Ile Gly Ala Il - #e Pro Ser Ile Met Asn          #       365                                                                    - Val Leu Leu Val Cys Leu Ile Phe Trp Leu Il - #e Phe Ser Ile Met Gly          #   380                                                                        - Val Asn Leu Phe Ala Gly Lys Phe Tyr Glu Cy - #s Val Asn Thr Thr Asp          385                 3 - #90                 3 - #95                 4 -        #00                                                                            - Gly Ser Arg Phe Pro Thr Ser Gln Val Ala As - #n Arg Ser Glu Cys Phe          #               415                                                            - Ala Leu Met Asn Val Ser Gly Asn Val Arg Tr - #p Lys Asn Leu Lys Val          #           430                                                                - Asn Phe Asp Asn Val Gly Leu Gly Tyr Leu Se - #r Leu Leu Gln Val Ala          #       445                                                                    - Thr Phe Lys Gly Trp Met Asp Ile Met Tyr Al - #a Ala Val Asp Ser Val          #   460                                                                        - Asn Val Asn Glu Gln Pro Lys Tyr Glu Tyr Se - #r Leu Tyr Met Tyr Ile          465                 4 - #70                 4 - #75                 4 -        #80                                                                            - Tyr Phe Val Ile Phe Ile Ile Phe Gly Ser Ph - #e Phe Thr Leu Asn Leu          #               495      
                                                      - Phe Ile Gly Val Ile Ile Asp Asn Phe Asn Gl - #n Gln Lys Lys Lys Leu          #           510                                                                - Gly Gly Gln Asp Ile Phe Met Thr Glu Glu Gl - #n Lys Lys Tyr Tyr Asn          #       525                                                                    - Ala Met Lys Lys Leu Gly Ser Lys Lys Pro Gl - #n Lys Pro Ile Pro Arg          #   540                                                                        - Pro Gly Asn Lys Phe Gln Gly Cys Ile Phe As - #p Leu Val Thr Asn Gln          545                 5 - #50                 5 - #55                 5 -        #60                                                                            - Ala Phe Asp Ile Thr Ile Met Val Leu Ile Cy - #s Leu Asn Met Val Thr          #               575                                                            - Met Met Val Glu Lys Glu Gly Gln Thr Glu Ty - #r Met Asp Tyr Val Leu          #           590                                                                - His Trp Ile Asn Met Val Phe Ile Ile Leu Ph - #e Thr Gly Glu Cys Val          #       605                                                                    - Leu Lys Leu Ile Ser Leu Arg His Tyr Tyr Ph - #e Thr Val Gly Trp Asn          #   620                                                                        - Ile Leu Tyr Phe Val Val Val Ile Leu Ser Il - #e Val Gly Met Phe Leu          625                 6 - #30                 6 - #35                 6 -        #40                                                                            - Ala Glu Met Ile Glu Lys Tyr Phe Val Ser Pr - #o Thr Leu Phe Arg Val          #               655                                                            - Ile Arg Leu Ala Arg Ile Gly Arg Ile Leu Ar - #g Leu Ile Lys Gly Ala          #           670                                                                - Lys Gly Ile Arg Thr Leu Leu Phe Ala Leu Me - #t 
Met Ser Leu Pro Ala          #       685                                                                    - Leu Phe Asn Ile Gly Leu Leu Leu Phe Leu Va - #l Met Phe Ile Tyr Ala          #   700                                                                        - Ile Phe Gly Met Ser Asn Phe Ala Tyr Val Ly - #s Lys Glu Ala Gly Ile          705                 7 - #10                 7 - #15                 7 -        #20                                                                            - Asn Asp Met Phe Asn Phe Glu Thr Phe Gly As - #n Ser Met Ile Cys Leu          #               735                                                            - Phe Gln Ile Thr Thr Ser Ala Gly Trp Asp Gl - #y Leu Leu Ala Pro Ile          #           750                                                                - Leu Asn Ser Ala Pro Pro Asp Cys Asp Pro Ly - #s Lys Val His Pro Gly          #       765                                                                    - Ser Ser Val Glu Gly Asp Cys Gly Asn Pro Se - #r Val Gly Ile Phe Tyr          #   780                                                                        - Phe Val Ser Tyr Ile Ile Ile Ser Phe Leu Va - #l Val Val Asn Met Tyr          785                 7 - #90                 7 - #95                 8 -        #00                                                                            - Ile Ala Val Ile Leu Glu Asn Phe Ser Val Al - #a Thr Glu Glu Ser Thr          #               815                                                            - Glu Pro Leu Ser Glu Asp Asp Phe Glu Met Ph - #e Tyr Glu Val Trp Glu          #           830                                                                - Lys Phe Asp Pro Asp Ala Thr Gln Phe Ile Gl - #u Phe Cys Lys Leu Ser          #       845                                                                    - Asp Phe Ala Ala Ala Leu Asp Pro Pro Leu Le - #u Ile Ala Lys Pro Asn          #   860                                                                    

Lys Val Gln Leu Ile Ala Met Asp Leu Pro Met Val Ser Gly Asp Arg
865                 870                 875                 880

Ile His Cys Leu Asp Ile Leu Phe Ala Phe Thr Lys Arg Val Leu Gly
                885                 890                 895

Glu Gly Gly Glu Met Asp Ser Leu Arg Ser Gln Met Glu Glu Arg Phe
            900                 905                 910

Met Ser Ala Asn Pro Ser Lys Val Ser Tyr Glu Pro Ile Thr Thr Thr
        915                 920                 925

Leu Lys Arg Lys Gln Glu Glu Val Ser Ala Thr Ile Ile Gln Arg Ala
    930                 935                 940

Tyr Arg Arg Tyr Arg Leu Arg Gln His Val Lys Asn Ile Ser Ser Ile
945                 950                 955                 960

Tyr Ile Lys Asp Gly Asp Arg Asp Asp Asp Leu Pro Asn Lys Glu Asp
                965                 970                 975

Thr Val Phe Asp Asn Val Asn Glu Asn Ser Ser Pro Glu Lys Thr Asp
            980                 985                 990

Val Thr Ala Ser Thr Ile Ser Pro Pro Ser Tyr Asp Ser Val Thr Lys
        995                 1000                1005

Pro Asp Gln
    1010

(2) INFORMATION FOR SEQ ID NO:3:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 29 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA

    (ix) FEATURE:
          (A) NAME/KEY: misc_feature
          (B) LOCATION: 12
          (D) OTHER INFORMATION: /note= "Base is Inosine"

    (ix) FEATURE:
          (A) NAME/KEY: misc_feature
          (B) LOCATION: 15
          (D) OTHER INFORMATION: /note= "Base is Inosine"

    (ix) FEATURE:
          (A) NAME/KEY: misc_feature
          (B) LOCATION: 19
          (D) OTHER INFORMATION: /note= "Base is Inosine"

    (ix) FEATURE:
          (A) NAME/KEY: misc_feature
          (B) LOCATION: 21
          (D) OTHER INFORMATION: /note= "Base is Inosine"

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

TYNN NATHATGGG                                                29

(2) INFORMATION FOR SEQ ID NO:4:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 8 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: peptide

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Phe Trp Leu Ile Phe Ser Ile Met
1               5

(2) INFORMATION FOR SEQ ID NO:5:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 34 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

RTTR TCDATDATNA CNCC                                          34

(2) INFORMATION FOR SEQ ID NO:6:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 8 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: peptide

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

Gly Val Ile Ile Asp Asn Phe Asn
1               5

(2) INFORMATION FOR SEQ ID NO:7:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 2005 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: peptide

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

Met Ala Arg Ser Val Leu Val Pro Pro Gly Pro Asp Ser Phe Arg Phe
1               5                   10                  15

Phe Thr Arg Glu Ser Leu Ala Ala Ile Glu Gln Arg Ile Ala Glu Glu
            20                  25                  30

Lys Ala Lys Arg Pro Lys Gln Glu Arg Lys Asp Glu Asp Asp Glu Asn
        35                  40                  45

Gly Pro Lys Pro Asn Ser Asp Leu Glu Ala Gly Lys Ser Leu Pro Phe
    50                  55                  60

Ile Tyr Gly Asp Ile Pro Pro Glu Met Val Ser Glu Pro Leu Glu Asp
65                  70                  75                  80

Leu Asp Pro Tyr Tyr Ile Asn Lys Lys Thr Phe Ile Val Leu Asn Lys
                85                  90                  95

Gly Lys Ala Ile Ser Arg Phe Ser Ala Thr Ser Ala Leu Tyr Ile Leu
            100                 105                 110

Thr Pro Phe Asn Pro Ile Arg Lys Leu Ala Ile Lys Ile Leu Val His
        115                 120                 125

Ser Leu Phe Asn Val Leu Ile Met Cys Thr Ile Leu Thr Asn Cys Val
    130                 135                 140

Phe Met Thr Met Ser Asn Pro Pro Asp Trp Thr Lys Asn Val Glu Tyr
145                 150                 155                 160

Thr Phe Thr Gly Ile Tyr Thr Phe Glu Ser Leu Ile Lys Ile Leu Ala
                165                 170                 175

Arg Gly Phe Cys Leu Glu Asp Phe Thr Phe Leu Arg Asn Pro Trp Asn
            180                 185                 190

Trp Leu Asp Phe Thr Val Ile Thr Phe Ala Tyr Val Thr Glu Phe Val
        195                 200                 205

Asn Leu Gly Asn Val Ser Ala Leu Arg Thr Phe Arg Val Leu Arg Ala
    210                 215                 220

Leu Lys Thr Ile Ser Val Ile Pro Gly Leu Lys Thr Ile Val Gly Ala
225                 230                 235                 240

Leu Ile Gln Ser Val Lys Lys Leu Ser Asp Val Met Ile Leu Thr Val
                245                 250                 255

Phe Cys Leu Ser Val Phe Ala Leu Ile Gly Leu Gln Leu Phe Met Gly
            260                 265                 270

Asn Leu Arg Asn Lys Cys Leu Gln Trp Pro Pro Asp Asn Ser Thr Phe
        275                 280                 285

Glu Ile Asn Ile Thr Ser Phe Phe Asn Asn Ser Leu Asp Trp Asn Gly
    290                 295                 300

Thr Ala Phe Asn Arg Thr Val Asn Met Phe Asn Trp Asp Glu Tyr Ile
305                 310                 315                 320

Glu Asp Lys Ser His Phe Tyr Phe Leu Glu Gly Gln Asn Asp Ala Leu
                325                 330                 335

Leu Cys Gly Asn Ser Ser Asp Ala Gly Gln Cys Pro Glu Gly Tyr Ile
            340                 345                 350

Cys Val Lys Ala Gly Arg Asn Pro Asn Tyr Gly Tyr Thr Ser Phe Asp
        355                 360                 365

Thr Phe Ser Trp Ala Phe Leu Ser Leu Phe Arg Leu Met Thr Gln Asp
    370                 375                 380

Phe Trp Glu Asn Leu Tyr Gln Leu Thr Leu Arg Ala Ala Gly Lys Thr
385                 390                 395                 400

Tyr Met Ile Phe Phe Val Leu Val Ile Phe Leu Gly Ser Phe Tyr Leu
                405                 410                 415

Ile Asn Leu Ile Leu Ala Val Val Ala Met Ala Tyr Glu Glu Gln Asn
            420                 425                 430

Gln Ala Thr Leu Glu Glu Ala Glu Gln Lys Glu Ala Glu Phe Gln Gln
        435                 440                 445

Met Leu Glu Gln Leu Lys Lys Gln Gln Glu Glu Ala Gln Ala Ala Ala
    450                 455                 460

Ala Ala Ala Ser Ala Glu Ser Arg Asp Phe Ser Gly Ala Gly Gly Ile
465                 470                 475                 480

Gly Val Phe Ser Glu Ser Ser Ser Val Ala Ser Lys Leu Ser Ser Lys
                485                 490                 495

Ser Glu Lys Glu Leu Lys Asn Arg Arg Lys Lys Lys Lys Gln Lys Glu
            500                 505                 510

Gln Ala Gly Glu Glu Glu Lys Glu Asp Ala Val Arg Lys Ser Ala Ser
        515                 520                 525

Glu Asp Ser Ile Arg Lys Lys Gly Phe Gln Phe Ser Leu Glu Gly Ser
    530                 535                 540

Arg Leu Thr Tyr Glu Lys Arg Phe Ser Ser Pro His Gln Ser Leu Leu
545                 550                 555                 560

Ser Ile Arg Gly Ser Leu Phe Ser Pro Arg Arg Asn Ser Arg Ala Ser
                565                 570                 575

Leu Phe Asn Phe Lys Gly Arg Val Lys Asp Ile Gly Ser Glu Asn Asp
            580                 585                 590

Phe Ala Asp Asp Glu His Ser Thr Phe Glu Asp Asn Asp Ser Arg Arg
        595                 600                 605

Asp Ser Leu Phe Val Pro His Arg His Gly Glu Arg Arg Pro Ser Asn
    610                 615                 620

Val Ser Gln Ala Ser Arg Ala Ser Arg Gly Ile Pro Thr Leu Pro Met
625                 630                 635                 640

Asn Gly Lys Met His Ser Ala Val Asp Cys Asn Gly Val Val Ser Leu
                645                 650                 655

Val Gly Gly Pro Ser Ala Leu Thr Ser Pro Val Gly Gln Leu Leu Pro
            660                 665                 670

Glu Gly Thr Thr Thr Glu Thr Glu Ile Arg Lys Arg Arg Ser Ser Ser
        675                 680                 685

Tyr His Val Ser Met Asp Leu Leu Glu Asp Pro Ser Arg Gln Arg Ala
    690                 695                 700

Met Ser Met Ala Ser Ile Leu Thr Asn Thr Met Glu Glu Leu Glu Glu
705                 710                 715                 720

Ser Arg Gln Lys Cys Pro Pro Cys Trp Tyr Lys Phe Ala Asn Met Cys
                725                 730                 735

Leu Ile Trp Asp Cys Cys Lys Pro Trp Leu Lys Val Lys His Val Val
            740                 745                 750

Asn Leu Val Val Met Asp Pro Phe Val Asp Leu Ala Ile Thr Ile Cys
        755                 760                 765

Ile Val Leu Asn Thr Leu Phe Met Ala Met Glu His Tyr Pro Met Thr
    770                 775                 780

Glu Gln Phe Ser Ser Val Leu Ser Val Gly Asn Leu Val Phe Thr Gly
785                 790                 795                 800

Ile Phe Thr Ala Glu Met Phe Leu Lys Ile Ile Ala Met Asp Pro Tyr
                805                 810                 815

Tyr Tyr Phe Gln Glu Gly Trp Asn Ile Phe Asp Gly Phe Ile Val Ser
            820                 825                 830

Leu Ser Leu Met Glu Leu Gly Leu Ala Asn Val Glu Gly Leu Ser Val
        835                 840                 845

Leu Arg Ser Phe Arg Leu Leu Arg Val Phe Lys Leu Ala Lys Ser Trp
    850                 855                 860

Pro Thr Leu Asn Met Leu Ile Lys Ile Ile Gly Asn Ser Val Gly Ala
865                 870                 875                 880

Leu Gly Asn Leu Thr Leu Val Leu Ala Ile Ile Val Phe Ile Phe Ala
                885                 890                 895

Val Val Gly Met Gln Leu Phe Gly Lys Ser Tyr Lys Glu Cys Val Cys
            900                 905                 910

Lys Ile Ser Asn Asp Cys Glu Leu Pro Arg Trp His Met His His Phe
        915                 920                 925

Phe His Ser Phe Leu Ile Val Phe Arg Val Leu Cys Gly Glu Trp Ile
    930                 935                 940

Glu Thr Met Trp Asp Cys Met Glu Val Ala Gly Gln Thr Met Cys Leu
945                 950                 955                 960

Thr Val Phe Met Met Val Met Val Ile Gly Asn Leu Val Val Leu Asn
                965                 970                 975

Leu Phe Leu Ala Leu Leu Leu Ser Ser Phe Ser Ser Asp Asn Leu Ala
            980                 985                 990

Ala Thr Asp Asp Asp Asn Glu Met Asn Asn Leu Gln Ile Ala Val Gly
        995                 1000                1005

Arg Met Gln Lys Gly Ile Asp Phe Val Lys Arg Lys Ile Arg Glu Phe
    1010                1015                1020

Ile Gln Lys Ala Phe Val Arg Lys Gln Lys Ala Leu Asp Glu Ile Lys
1025                1030                1035                1040

Pro Leu Glu Asp Leu Asn Asn Lys Lys Asp Ser Cys Ile Ser Asn His
                1045                1050                1055

Thr Thr Ile Glu Ile Gly Lys Asp Leu Asn Tyr Leu Lys Asp Gly Asn
            1060                1065                1070

Gly Thr Thr Ser Gly Ile Gly Ser Ser Val Glu Lys Tyr Val Val Asp
        1075                1080                1085

Glu Ser Asp Tyr Met Ser Phe Ile Asn Asn Pro Ser Leu Thr Val Thr
    1090                1095                1100

Val Pro Ile Ala Leu Gly Glu Ser Asp Phe Glu Asn Leu Asn Thr Glu
1105                1110                1115                1120

Glu Phe Ser Ser Glu Ser Asp Met Glu Glu Ser Lys Glu Lys Leu Asn
                1125                1130                1135

Ala Thr Ser Ser Ser Glu Gly Ser Thr Val Asp Ile Gly Ala Pro Ala
            1140                1145                1150

Glu Gly Glu Gln Pro Glu Ala Glu Pro Glu Glu Ser Leu Glu Pro Glu
        1155                1160                1165

Ala Cys Phe Thr Glu Asp Cys Val Arg Lys Phe Lys Cys Cys Gln Ile
    1170                1175                1180

Ser Ile Glu Glu Gly Lys Gly Lys Leu Trp Trp Asn Leu Arg Lys Thr
1185                1190                1195                1200

Cys Tyr Lys Ile Val Glu His Asn Trp Phe Glu Ile Phe Ile Val Phe
                1205                1210                1215

Met Ile Leu Leu Ser Ser Gly Ala Leu Ala Phe Glu Asp Ile Tyr Ile
            1220                1225                1230

Glu Gln Arg Lys Thr Ile Lys Thr Met Leu Glu Tyr Ala Asp Lys Val
        1235                1240                1245

Phe Thr Tyr Ile Phe Ile Leu Glu Met Leu Leu Lys Trp Val Ala Tyr
    1250                1255                1260

Gly Phe Gln Met Tyr Phe Thr Asn Ala Trp Cys Trp Leu Asp Phe Leu
1265                1270                1275                1280

Ile Val Asp Val Ser Leu Val Ser Leu Thr Ala Asn Ala Leu Gly Tyr
                1285                1290                1295

Ser Glu Leu Gly Ala Ile Lys Ser Leu Arg Thr Leu Arg Ala Leu Arg
            1300                1305                1310

Pro Leu Arg Ala Leu Ser Arg Phe Glu Gly Met Arg Val Val Val Asn
        1315                1320                1325

Ala Leu Leu Gly Ala Ile Pro Ser Ile Met Asn Val Leu Leu Val Cys
    1330                1335                1340

Leu Ile Phe Trp Leu Ile Phe Ser Ile Met Gly Val Asn Leu Phe Ala
1345                1350                1355                1360

Gly Lys Phe Tyr His Cys Ile Asn Tyr Thr Ile Gly Glu Met Phe Asp
                1365                1370                1375

Val Ser Val Val Asn Asn Tyr Ser Glu Cys Gln Ala Leu Ile Glu Ser
            1380                1385                1390

Asn Gln Thr Ala Arg Trp Lys Asn Val Lys Val Asn Phe Asp Asn Val
        1395                1400                1405

Gly Leu Gly Tyr Leu Ser Leu Leu Gln Val Ala Thr Phe Lys Gly Trp
    1410                1415                1420

Met Asp Ile Met Tyr Ala Ala Val Asp Ser Arg Asn Val Glu Leu Gln
1425                1430                1435                1440

Pro Lys Tyr Glu Asp Asn Leu Tyr Met Tyr Leu Tyr Phe Val Ile Phe
                1445                1450                1455

Ile Ile Phe Gly Ser Phe Phe Thr Leu Asn Leu Phe Ile Gly Val Ile
            1460                1465                1470

Ile Asp Asn Phe Asn Gln Gln Lys Lys Lys Phe Gly Gly Gln Asp Ile
        1475                1480                1485

Phe Met Thr Glu Glu Gln Lys Lys Tyr Tyr Asn Ala Met Lys Lys Leu
    1490                1495                1500

Gly Ser Lys Lys Pro Gln Lys Pro Ile Pro Arg Pro Ala Asn Lys Phe
1505                1510                1515                1520

Gln Gly Met Val Phe Asp Phe Val Thr Lys Gln Val Phe Asp Ile Ser
                1525                1530                1535

Ile Met Ile Leu Ile Cys Leu Asn Met Val Thr Met Met Val Glu Thr
            1540                1545                1550

Asp Asp Gln Ser Gln Glu Met Thr Asn Ile Leu Tyr Trp Ile Asn Leu
        1555                1560                1565

Val Phe Ile Val Leu Phe Thr Gly Glu Cys Val Leu Lys Leu Ile Ser
    1570                1575                1580

Leu Arg His Tyr Tyr Phe Thr Ile Gly Trp Asn Ile Phe Asp Phe Val
1585                1590                1595                1600

Val Val Ile Leu Ser Ile Val Gly Met Phe Leu Ala Glu Leu Ile Glu
                1605                1610                1615

Lys Tyr Phe Val Ser Pro Thr Leu Phe Arg Val Ile Arg Leu Ala Arg
            1620                1625                1630

Ile Gly Arg Ile Leu Arg Leu Ile Lys Gly Ala Lys Gly Ile Arg Thr
        1635                1640                1645

Leu Leu Phe Ala Leu Met Met Ser Leu Pro Ala Leu Phe Asn Ile Gly
    1650                1655                1660

Leu Leu Leu Phe Leu Val Met Phe Ile Tyr Ala Ile Phe Gly Met Ser
1665                1670                1675                1680

Asn Phe Ala Tyr Val Lys Arg Glu Val Gly Ile Asp Asp Met Phe Asn
                1685                1690                1695

Phe Glu Thr Phe Gly Asn Ser Met Ile Cys Leu Phe Gln Ile Thr Thr
            1700                1705                1710

Ser Ala Gly Trp Asp Gly Leu Leu Ala Pro Ile Leu Asn Ser Gly Pro
        1715                1720                1725

Pro Asp Cys Asp Pro Glu Lys Asp His Pro Gly Ser Ser Val Lys Gly
    1730                1735                1740

Asp Cys Gly Asn Pro Ser Val Gly Ile Phe Phe Phe Val Ser Tyr Ile
1745                1750                1755                1760

Ile Ile Ser Phe Leu Val Val Val Asn Met Tyr Ile Ala Val Ile Leu
                1765                1770                1775

Glu Asn Phe Ser Val Ala Thr Glu Glu Ser Ala Glu Pro Leu Ser Glu
            1780                1785                1790

Asp Asp Phe Glu Met Phe Tyr Glu Val Trp Glu Lys Phe Asp Pro Asp
        1795                1800                1805

Ala Thr Gln Phe Ile Glu Phe Cys Lys Leu Ser Asp Phe Ala Ala Ala
    1810                1815                1820

Leu Asp Pro Pro Leu Leu Ile Ala Lys Pro Asn Lys Val Gln Leu Ile
1825                1830                1835                1840

Ala Met Asp Leu Pro Met Val Ser Gly Asp Arg Ile His Cys Leu Asp
                1845                1850                1855

Ile Leu Phe Ala Phe Thr Lys Arg Val Leu Gly Glu Ser Gly Glu Met
            1860                1865                1870

Asp Ala Leu Arg Ile Gln Met Glu Glu Arg Phe Met Ala Ser Asn Pro
        1875                1880                1885

Ser Lys Val Ser Tyr Glu Pro Ile Thr Thr Thr Leu Lys Arg Lys Gln
    1890                1895                1900

Glu Glu Val Ser Ala Ile Val Ile Gln Arg Ala Tyr Arg Arg Tyr Leu
1905                1910                1915                1920

Leu Lys Gln Lys Val Lys Lys Val Ser Ser Ile Tyr Lys Lys Asp Lys
                1925                1930                1935

Gly Lys Glu Asp Glu Gly Thr Pro Ile Lys Glu Asp Ile Ile Thr Asp
            1940                1945                1950

Lys Leu Asn Glu Asn Ser Thr Pro Glu Lys Thr Asp Val Thr Pro Ser
        1955                1960                1965

Thr Thr Ser Pro Pro Ser Tyr Asp Ser Val Thr Lys Pro Glu Lys Glu
    1970                1975                1980

Lys Phe Glu Lys Asp Lys Ser Glu Lys Glu Asp Lys Gly Lys Asp Ile
1985                1990                1995                2000

Arg Glu Ser Lys Lys
                2005

(2) INFORMATION FOR SEQ ID NO:8:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 813 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: peptide

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

Asn Leu Val Val Leu Asn Leu Phe Leu Ala Leu Leu Leu Ser Ser Phe
1               5                   10                  15

Ser Ser Asp Asn Leu Ala Asp Asn Asn Leu Gln Ile Ala Val Arg Gly
            20                  25                  30

Ile Val Lys Arg Glu Phe Ile Lys Phe Lys Lys Lys Asp Asn Asn Lys
        35                  40                  45

Lys Ile Ser Asn Thr Glu Lys Asp Asn Leu Lys Ser Gly Gly Ser Ser
    50                  55                  60

Lys Asp Glu Asp Tyr Ser Phe Ile Asn Pro Ser Leu Thr Val Thr Val
65                  70                  75                  80

Pro Ile Ala Gly Glu Ser Asp Glu Asn Thr Glu Glu Ser Ser Ser Asp
                85                  90                  95

Ser Lys Glu Lys Asn Ser Ser Ser Glu Ser Thr Val Asp Pro Glu Glu
            100                 105                 110

Glu Ala Glu Pro Glu Pro Glu Ala Cys Phe Thr Cys Val Arg Phe Cys
        115                 120                 125

Cys Gln Gly Lys Gly Lys Trp Trp Arg Lys Thr Cys Tyr Ile Val Glu
    130                 135                 140

His Trp Phe Glu Phe Ile Val Met Ile Leu Leu Ser Ser Gly Ala Leu
145                 150                 155                 160

Ala Phe Glu Asp Ile Tyr Ile Glu Lys Thr Ile Lys Leu Glu Tyr Ala
                165                 170                 175

Asp Lys Phe Thr Tyr Ile Phe Ile Leu Glu Met Leu Leu Lys Trp Val
            180                 185                 190

Ala Tyr Gly Tyr Phe Thr Asn Ala Trp Cys Trp Leu Asp Phe Leu Ile
        195                 200                 205

Val Asp Val Ser Leu Val Leu Ala Asn Leu Gly Tyr Ser Leu Gly Ile
    210                 215                 220

Lys Ser Leu Arg Thr Leu Arg Ala Leu Arg Pro Leu Arg Ala Leu Ser
225                 230                 235                 240

Arg Phe Glu Gly Met Arg Val Val Val Asn Ala Leu Gly Ala Ile Pro
                245                 250                 255

Ser Ile Met Asn Val Leu Leu Val Cys Leu Ile Phe Trp Leu Ile Phe
            260                 265                 270

Ser Ile Met Gly Val Asn Leu Phe Ala Gly Lys Phe Tyr Cys Asn Thr
        275                 280                 285

Gly Phe Ser Val Asn Ser Glu Cys Ala Leu Arg Trp Lys Asn Lys Val
    290                 295                 300

Asn Phe Asp Asn Val Gly Leu Gly Tyr Leu Ser Leu Leu Gln Val Ala
305                 310                 315                 320

Thr Phe Lys Gly Trp Met Asp Ile Met Tyr Ala Ala Val Asp Ser Asn
                325                 330                 335

Val Gln Pro Lys Tyr Glu Leu Tyr Met Tyr Tyr Phe Val Ile Phe Ile
            340                 345                 350

Ile Phe Gly Ser Phe Phe Thr Leu Asn Leu Phe Ile Gly Val Ile Ile
        355                 360                 365

Asp Asn Phe Asn Gln Gln Lys Lys Lys Gly Gly Gln Asp Ile Phe Met
    370                 375                 380

Thr Glu Glu Gln Lys Lys Tyr Tyr Asn Ala Met Lys Lys Leu Gly Ser
385                 390                 395                 400

Lys Lys Pro Gln Lys Pro Ile Pro Arg Pro Asn Lys Phe Gln Gly Phe
                405                 410                 415

Asp Val Thr Gln Phe Asp Ile Ile Met Leu Ile Cys Leu Asn Met Val
            420                 425                 430

Thr Met Met Val Glu Gln Met Leu Trp Ile Asn Val Phe Ile Leu Phe
        435                 440                 445

Thr Gly Glu Cys Val Leu Lys Leu Ile Ser Leu Arg His Tyr Tyr Phe
    450                 455                 460

Thr Gly Trp Asn Ile Phe Val Val Val Ile Leu Ser Ile Val Gly Met
465                 470                 475                 480

Phe Leu Ala Glu Ile Glu Lys Tyr Phe Val Ser Pro Thr Leu Phe Arg
                485                 490                 495

Val Ile Arg Leu Ala Arg Ile Gly Arg Ile Leu Arg Leu Ile Lys Gly
            500                 505                 510

Ala Lys Gly Ile Arg Thr Leu Leu Phe Ala Leu Met Met Ser Leu Pro
        515                 520                 525

Ala Leu Phe Asn Ile Gly Leu Leu Leu Phe Leu Val Met Phe Ile Tyr
    530                 535                 540

Ala Ile Phe Gly Met Ser Asn Phe Ala Tyr Val Lys Glu Gly Ile Asp
545                 550                 555                 560
Met Phe Asn Phe Glu Thr Phe Gly Asn Ser Met Ile Cys Leu Phe Gln
                                                        575

Ile Thr Thr Ser Ala Gly Trp Asp Gly Leu Leu Ala Pro Ile Leu Asn
                                                    590

Ser Pro Pro Asp Cys Asp Pro Lys His Pro Gly Ser Ser Val Gly Asp
                                                605

Cys Gly Asn Pro Ser Val Gly Ile Phe Phe Val Ser Tyr Ile Ile Ile
                                            620

Ser Phe Leu Val Val Val Asn Met Tyr Ile Ala Val Ile Leu Glu Asn
625                 630                 635                 640

Phe Ser Val Ala Thr Glu Glu Ser Glu Pro Leu Ser Glu Asp Asp Phe
                                                        655

Glu Met Phe Tyr Glu Val Trp Glu Lys Phe Asp Pro Asp Ala Thr Gln
                                                    670

Phe Ile Glu Phe Cys Lys Leu Ser Asp Phe Ala Ala Ala Leu Asp Pro
                                                685

Pro Leu Leu Ile Ala Lys Pro Asn Lys Val Gln Leu Ile Ala Met Asp
                                            700

Leu Pro Met Val Ser Gly Asp Arg Ile His Cys Leu Asp Ile Leu Phe
705                 710                 715                 720

Ala Phe Thr Lys Arg Val Leu Gly Glu Gly Glu Met Asp Leu Arg Gln
                                                        735

Met Glu Glu Arg Phe Met Asn Pro Ser Lys Val Ser Tyr Glu Pro Ile
                                                    750

Thr Thr Thr Leu Lys Arg Lys Gln Glu Glu Val Ser Ala Ile Gln Arg
                                                765

Ala Tyr Arg Arg Tyr Leu Gln Val Lys Ser Ser Ile Tyr Lys Asp Asp
                                            780

Pro Lys Glu Asp Asp Asn Glu Asn Ser Pro Glu Lys Thr Asp Val Thr
785                 790                 795                 800

Ser Thr Ser Pro Pro Ser Tyr Asp Ser Val Thr Lys Pro
                                                        810

(2) INFORMATION FOR SEQ ID NO:9:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 6452 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: both

    (ii) MOLECULE TYPE: DNA (genomic)

    (ix) FEATURE:
          (A) NAME/KEY: CDS
          (B) LOCATION: 326..6277

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

GTCGCCTCAT CCTGAGCAGA CTGGAAACAG ACTCCGTGCA GGCCTCGCCC GCGCTCCAGT          60

TGCGACTGTA GGGTTTTCAT TCCTGCCCAC TGCGCAGACT GGGCTGAGCT AGCCTGGGTA         120

                - TCCACGATTC GCGACTCGTA GTAACAGGCA CTCTGAGCAA CAGGATTTCA GA - #GAAAGAAG         180                                                                           - CAGAGGCAAG AAAGAAGCCT GGGGAGAGAG GAAGACTTTC CTTGGATCAG AC - #TCCGCAGG         240                                                                           - TGCACACACC GGGTGGGCAT GATCCGTGGG GCCAGGCCTC TTAGGTAAGG AG - #TCAAAGGG         300                                                                           #CCT CCA GGA CCT      352AAAG ATG GCG ATG CTG CCT                              #          Met Ala Met Leu Pro - # Pro Pro Gly Pro                             #   10205                                                                      - CAG AGT TTC GTT CAC TTC ACA AAA CAG TCC CT - #T GCC CTC ATT GAA CAG           400                                                                           Gln Ser Phe Val His Phe Thr Lys Gln Ser Le - #u Ala Leu Ile Glu Gln            #              10350                                                           - CGT ATT TCT GAA GAA AAA GCC AAG GAA CAC AA - #A GAC GAA AAG AAA GAT           448                                                                           Arg Ile Ser Glu Glu Lys Ala Lys Glu His Ly - #s Asp Glu Lys Lys Asp            #          10505                                                               - GAT GAG GAA GAA GGC CCC AAG CCC AGC AGT GA - #C TTG GAA GCT GGG AAA           496                                                                           Asp Glu Glu Glu Gly Pro Lys Pro Ser Ser As - #p Leu Glu Ala Gly Lys            #      10650                                                                   - CAG CTC CCC TTC ATC TAT GGA GAC ATT CCC CC - #T GGA ATG GTG TCA GAG           544                                                                           Gln Leu Pro Phe Ile Tyr Gly Asp Ile Pro Pr - #o Gly Met Val Ser Glu            #  10805                                                                       - CCC 
CTG GAG GAC CTG GAC CCA TAC TAT GCT GA - #C AAA AAA ACT TTT ATA           592                                                                           Pro Leu Glu Asp Leu Asp Pro Tyr Tyr Ala As - #p Lys Lys Thr Phe Ile            #               11001090 - #                1095                               - GTA TTG AAC AAA GGG AAA GCA ATC TTC CGT TT - #C AAC GCC ACC CCT GCT           640                                                                           Val Leu Asn Lys Gly Lys Ala Ile Phe Arg Ph - #e Asn Ala Thr Pro Ala            #              11150                                                           - TTG TAC ATG CTG TCT CCC TTC AGT CCT CTA AG - #A AGA ATA TCT ATT AAG           688                                                                           Leu Tyr Met Leu Ser Pro Phe Ser Pro Leu Ar - #g Arg Ile Ser Ile Lys            #          11305                                                               - ATC TTA GTG CAC TCC TTA TTC AGC ATG CTA AT - #C ATG TGC ACA ATT CTG           736                                                                           Ile Leu Val His Ser Leu Phe Ser Met Leu Il - #e Met Cys Thr Ile Leu            #      11450                                                                   - ACG AAC TGC ATA TTC ATG ACC TTG AGC AAC CC - #T CCA GAA TGG ACC AAA           784                                                                           Thr Asn Cys Ile Phe Met Thr Leu Ser Asn Pr - #o Pro Glu Trp Thr Lys            #  11605                                                                       - AAT GTA GAG TAC ACT TTT ACT GGG ATA TAT AC - #T TTT GAA TCA CTC ATA           832                                                                           Asn Val Glu Tyr Thr Phe Thr Gly Ile Tyr Th - #r Phe Glu Ser Leu Ile            #               11801170 - #                1175                               - AAA ATC CTT GCA AGA GGC TTT TGC GTG GGA GA - #A TTC ACC TTC CTC CGT           880                           
                                                Lys Ile Leu Ala Arg Gly Phe Cys Val Gly Gl - #u Phe Thr Phe Leu Arg            #              11950                                                           - GAC CCT TGG AAC TGG CTG GAC TTT GTT GTC AT - #T GTT TTT GCG TAT TTA           928                                                                           Asp Pro Trp Asn Trp Leu Asp Phe Val Val Il - #e Val Phe Ala Tyr Leu            #          12105                                                               - ACA GAA TTT GTA AAC CTA GGC AAT GTT TCA GC - #T CTT CGA ACT TTC AGA           976                                                                           Thr Glu Phe Val Asn Leu Gly Asn Val Ser Al - #a Leu Arg Thr Phe Arg            #      12250                                                                   - GTC TTG AGA GCT TTG AAA ACT ATT TCT GTA AT - #C CCA GGA CTA AAG ACC          1024                                                                           Val Leu Arg Ala Leu Lys Thr Ile Ser Val Il - #e Pro Gly Leu Lys Thr            #  12405                                                                       - ATC GTG GGG GCC CTG ATC CAG TCA GTG AAG AA - #G CTC TCT GAC GTC ATG          1072                                                                           Ile Val Gly Ala Leu Ile Gln Ser Val Lys Ly - #s Leu Ser Asp Val Met            #               12601250 - #                1255                               - ATC CTC ACT GTG TTC TGT CTC AGT GTG TTT GC - #A CTA ATT GGA CTA CAG          1120                                                                           Ile Leu Thr Val Phe Cys Leu Ser Val Phe Al - #a Leu Ile Gly Leu Gln            #              12750                                                           - CTG TTT ATG GGC AAC TTG AAG CAT AAA TGT TT - #C AGG AAG GAA CTC GAA          1168                                                                           Leu Phe Met Gly Asn Leu Lys His Lys Cys Ph - #e Arg Lys 
Glu Leu Glu            #          12905                                                               - GAG AAT GAA ACA TTA GAA AGT ATC ATG AAT AC - #T GCT GAG AGT GAA GAA          1216                                                                           Glu Asn Glu Thr Leu Glu Ser Ile Met Asn Th - #r Ala Glu Ser Glu Glu            #      13050                                                                   - GAA TTG AAA AAA TAT TTT TAT TAC TTG GAG GG - #A TCC AAA GAT GCT CTA          1264                                                                           Glu Leu Lys Lys Tyr Phe Tyr Tyr Leu Glu Gl - #y Ser Lys Asp Ala Leu            #  13205                                                                       - CTC TGC GGC TTC AGC ACA GAT TCA GGG CAG TG - #T CCA GAA GGC TAC ATC          1312                                                                           Leu Cys Gly Phe Ser Thr Asp Ser Gly Gln Cy - #s Pro Glu Gly Tyr Ile            #               13401330 - #                1335                               - TGT GTG AAG GCT GGC AGA AAC CCG GAT TAT GG - #C TAC ACG AGC TTT GAC          1360                                                                           Cys Val Lys Ala Gly Arg Asn Pro Asp Tyr Gl - #y Tyr Thr Ser Phe Asp            #              13550                                                           - ACA TTC AGC TGG GCC TTC TTG GCC TTG TTT CG - #G CTA ATG ACT CAG GAC          1408                                                                           Thr Phe Ser Trp Ala Phe Leu Ala Leu Phe Ar - #g Leu Met Thr Gln Asp            #          13705                                                               - TAC TGG GAG AAC CTT TAC CAA CAG ACT CTG CG - #T GCT GCT GGC AAA ACC          1456                                                                           Tyr Trp Glu Asn Leu Tyr Gln Gln Thr Leu Ar - #g Ala Ala Gly Lys Thr            #      13850                                                                   - 
TAC ATG ATT TTC TTT GTC GTG GTT ATT TTT CT - #G GGC TCC TTT TAC CTG          1504                                                                           Tyr Met Ile Phe Phe Val Val Val Ile Phe Le - #u Gly Ser Phe Tyr Leu            #  14005                                                                       - ATA AAC TTG ATC CTG GCT GTG GTA GCC ATG GC - #G TAT GAG GAA CAG AAC          1552                                                                           Ile Asn Leu Ile Leu Ala Val Val Ala Met Al - #a Tyr Glu Glu Gln Asn            #               14201410 - #                1415                               - CAG GCC AAC ATC GAA GAA GCT AAA CAG AAA GA - #G TTA GAA TTT CAG CAG          1600                                                                           Gln Ala Asn Ile Glu Glu Ala Lys Gln Lys Gl - #u Leu Glu Phe Gln Gln            #              14350                                                           - ATG TTA GAC CGA CTC AAA AAG GAG CAG GAA GA - #A GCT GAG GCG ATC GCT          1648                                                                           Met Leu Asp Arg Leu Lys Lys Glu Gln Glu Gl - #u Ala Glu Ala Ile Ala            #          14505                                                               - GCA GCT GCT GCT GAG TTC ACG AGT ATA GGG CG - #G AGC AGG ATC ATG GGA          1696                                                                           Ala Ala Ala Ala Glu Phe Thr Ser Ile Gly Ar - #g Ser Arg Ile Met Gly            #      14650                                                                   - CTC TCT GAG AGC TCT TCA GAA ACC TCC AGG CT - #G AGC TCA AAG AGT GCC          1744                                                                           Leu Ser Glu Ser Ser Ser Glu Thr Ser Arg Le - #u Ser Ser Lys Ser Ala            #  14805                                                                       - AAG GAG AGA AGA AAC CGA AGA AAG AAA AAG AA - #A CAG AAG ATG TCC AGT          1792                       
                                                    Lys Glu Arg Arg Asn Arg Arg Lys Lys Lys Ly - #s Gln Lys Met Ser Ser            #               15001490 - #                1495                               - GGC GAG GAA AAG GGT GAC GAT GAG AAG CTG TC - #C AAG TCA GGA TCA GAG          1840                                                                           Gly Glu Glu Lys Gly Asp Asp Glu Lys Leu Se - #r Lys Ser Gly Ser Glu            #              15150                                                           - GAA AGC ATC CGA AAG AAA AGC TTC CAT CTC GG - #T GTG GAA GGG CAC CAC          1888                                                                           Glu Ser Ile Arg Lys Lys Ser Phe His Leu Gl - #y Val Glu Gly His His            #          15305                                                               - CGG ACC CGG GAA AAG AGG CTG TCC ACC CCC AA - #C CAG TCG CCA CTC AGC          1936                                                                           Arg Thr Arg Glu Lys Arg Leu Ser Thr Pro As - #n Gln Ser Pro Leu Ser            #      15450                                                                   - ATT CGC GGG TCC CTG TTT TCT GCC AGG CGC AG - #C AGC AGG ACG AGT CTC          1984                                                                           Ile Arg Gly Ser Leu Phe Ser Ala Arg Arg Se - #r Ser Arg Thr Ser Leu            #  15605                                                                       - TTC AGT TTT AAG GGG CGA GGA AGA GAT CTG GG - #A TCT GAG ACA GAA TTC          2032                                                                           Phe Ser Phe Lys Gly Arg Gly Arg Asp Leu Gl - #y Ser Glu Thr Glu Phe            #               15801570 - #                1575                               - GCT GAT GAT GAG CAT AGC ATT TTT GGA GAC AA - #C GAG AGC AGA AGG GGT          2080                                                                           Ala Asp Asp Glu His Ser Ile Phe Gly Asp As - #n Glu 
Ser Arg Arg Gly            #              15950                                                           - TCA CTA TTC GTA CCC CAT AGA CCC CGG GAG CG - #G CGC AGC AGT AAC ATC          2128                                                                           Ser Leu Phe Val Pro His Arg Pro Arg Glu Ar - #g Arg Ser Ser Asn Ile            #          16105                                                               - AGT CAG GCC AGT AGG TCC CCG CCA GTG CTA CC - #G GTG AAC GGG AAG ATG          2176                                                                           Ser Gln Ala Ser Arg Ser Pro Pro Val Leu Pr - #o Val Asn Gly Lys Met            #      16250                                                                   - CAC AGT GCA GTG GAC TGC AAT GGA GTC GTG TC - #G CTT GTT GAT GGA CCC          2224                                                                           His Ser Ala Val Asp Cys Asn Gly Val Val Se - #r Leu Val Asp Gly Pro            #  16405                                                                       - TCA GCC CTC ATG CTC CCC AAT GGA CAG CTT CT - #T CCA GAG GTG ATA ATA          2272                                                                           Ser Ala Leu Met Leu Pro Asn Gly Gln Leu Le - #u Pro Glu Val Ile Ile            #               16601650 - #                1655                               - GAT AAG GCA ACT TCC GAC GAC AGC GGC ACG AC - #T AAT CAG ATG CGC AAA          2320                                                                           Asp Lys Ala Thr Ser Asp Asp Ser Gly Thr Th - #r Asn Gln Met Arg Lys            #              16750                                                           - AAA AGG CTC TCT AGT TCT TAC TTC TTG TCT GA - #G GAC ATG CTG AAT GAC          2368                                                                           Lys Arg Leu Ser Ser Ser Tyr Phe Leu Ser Gl - #u Asp Met Leu Asn Asp            #          16905                                                             
  - CCG CAT CTC AGG CAA AGG GCC ATG AGC AGG GC - #G AGC ATA CTG ACC AAC          2416                                                                           Pro His Leu Arg Gln Arg Ala Met Ser Arg Al - #a Ser Ile Leu Thr Asn            #      17050                                                                   - ACT GTG GAA GAA CTT GAA GAA TCT AGA CAA AA - #A TGT CCA CCA TGG TGG          2464                                                                           Thr Val Glu Glu Leu Glu Glu Ser Arg Gln Ly - #s Cys Pro Pro Trp Trp            #  17205                                                                       - TAC AGA TTT GCT CAC ACA TTT TTA ATC TGG AA - #T TGC TCT CCA TAT TGG          2512                                                                           Tyr Arg Phe Ala His Thr Phe Leu Ile Trp As - #n Cys Ser Pro Tyr Trp            #               17401730 - #                1735                               - ATA AAA TTC AAA AAG CTC ATC TAT TTT ATT GT - #G ATG GAT CCT TTT GTA          2560                                                                           Ile Lys Phe Lys Lys Leu Ile Tyr Phe Ile Va - #l Met Asp Pro Phe Val            #              17550                                                           - GAT CTT GCA ATT ACC ATT TGC ATA GTT TTA AA - #C ACC TTA TTT ATG GCT          2608                                                                           Asp Leu Ala Ile Thr Ile Cys Ile Val Leu As - #n Thr Leu Phe Met Ala            #          17705                                                               - ATG GAG CAC CAC CCA ATG ACT GAA GAA TTC AA - #A AAT GTC CTT GCA GTG          2656                                                                           Met Glu His His Pro Met Thr Glu Glu Phe Ly - #s Asn Val Leu Ala Val            #      17850                                                                   - GGG AAC TTG ATC TTT ACA GGG ATC TTC GCA GC - #T GAA ATG GTA CTG AAG          2704                   
                                                        Gly Asn Leu Ile Phe Thr Gly Ile Phe Ala Al - #a Glu Met Val Leu Lys            #  18005                                                                       - TTA ATA GCC ATG GAC CCC TAT GAG TAT TTC CA - #A GTA GGG TGG AAT ATT          2752                                                                           Leu Ile Ala Met Asp Pro Tyr Glu Tyr Phe Gl - #n Val Gly Trp Asn Ile            #               18201810 - #                1815                               - TTT GAC AGC CTA ATT GTG ACG CTG AGT TTG AT - #A GAG CTT TTC CTA GCA          2800                                                                           Phe Asp Ser Leu Ile Val Thr Leu Ser Leu Il - #e Glu Leu Phe Leu Ala            #              18350                                                           - GAT GTG GAA GGA TTA TCA GTT CTG CGG TCA TT - #C AGA TTG CTC CGA GTC          2848                                                                           Asp Val Glu Gly Leu Ser Val Leu Arg Ser Ph - #e Arg Leu Leu Arg Val            #          18505                                                               - TTC AAG TTG GCA AAG TCC TGG CCC ACA CTG AA - #C ATG CTC ATT AAG ATC          2896                                                                           Phe Lys Leu Ala Lys Ser Trp Pro Thr Leu As - #n Met Leu Ile Lys Ile            #      18650                                                                   - ATC GGC AAC TCG GTG GGC GCA CTG GGC AAC CT - #G ACC CTG GTG CTG GCC          2944                                                                           Ile Gly Asn Ser Val Gly Ala Leu Gly Asn Le - #u Thr Leu Val Leu Ala            #  18805                                                                       - ATC ATC GTC TTC ATT TTT GCC GTG GTC GGC AT - #G CAG CTG TTT GGA AAG          2992                                                                           Ile Ile Val Phe Ile Phe Ala Val Val Gly Me - #t 
Gln Leu Phe Gly Lys            #               19001890 - #                1895                               - AGC TAC AAG GAG TGT GTC TGC AAG ATC AAT GT - #G GAC TGC AAG CTG CCG          3040                                                                           Ser Tyr Lys Glu Cys Val Cys Lys Ile Asn Va - #l Asp Cys Lys Leu Pro            #              19150                                                           - CGC TGG CAC ATG AAC GAC TTC TTC CAC TCC TT - #C CTC ATC GTG TTC CGA          3088                                                                           Arg Trp His Met Asn Asp Phe Phe His Ser Ph - #e Leu Ile Val Phe Arg            #          19305                                                               - GTG CTG TGT GGG GAG TGG ATA GAG ACC ATG TG - #G GAC TGC ATG GAG GTC          3136                                                                           Val Leu Cys Gly Glu Trp Ile Glu Thr Met Tr - #p Asp Cys Met Glu Val            #      19450                                                                   - GCG GGC CAG ACC ATG TGC CTT ATT GTT TAC AT - #G ATG GTC ATG GTG ATT          3184                                                                           Ala Gly Gln Thr Met Cys Leu Ile Val Tyr Me - #t Met Val Met Val Ile            #  19605                                                                       - GGG AAC CTT GTG GTC CTG AAC CTG TTT CTG GC - #T CTT TTG CTG AGT TCC          3232                                                                           Gly Asn Leu Val Val Leu Asn Leu Phe Leu Al - #a Leu Leu Leu Ser Ser            #               19801970 - #                1975                               - TTT AGT TCT GAC AAT CTT ACA GCA ATT GAG GA - #A GAC ACC GAT GCA AAC          3280                                                                           Phe Ser Ser Asp Asn Leu Thr Ala Ile Glu Gl - #u Asp Thr Asp Ala Asn            #              19950                                                     
      - AAC CTC CAG ATC GCA GTG GCC AGA ATT AAG AG - #G GGA ATC AAT TAC GTG          3328                                                                           Asn Leu Gln Ile Ala Val Ala Arg Ile Lys Ar - #g Gly Ile Asn Tyr Val            #          20105                                                               - AAA CAG ACC CTG CGT GAA TTC ATT CTA AAA TC - #A TTT TCC AAA AAG CCA          3376                                                                           Lys Gln Thr Leu Arg Glu Phe Ile Leu Lys Se - #r Phe Ser Lys Lys Pro            #      20250                                                                   - AAG GGC TCC AAG GAC ACA AAA CGA ACA GCA GA - #T CCC AAC AAC AAG AAA          3424                                                                           Lys Gly Ser Lys Asp Thr Lys Arg Thr Ala As - #p Pro Asn Asn Lys Lys            #  20405                                                                       - GAA AAC TAT ATT TCA AAC CGT ACC CTT GCG GA - #G ATG AGC AAG GAT CAC          3472                                                                           Glu Asn Tyr Ile Ser Asn Arg Thr Leu Ala Gl - #u Met Ser Lys Asp His            #               20602050 - #                2055                               - AAT TTC CTC AAA GAA AAG GAT AGG ATC AGT GG - #T TAT GGC AGC AGT CTA          3520                                                                           Asn Phe Leu Lys Glu Lys Asp Arg Ile Ser Gl - #y Tyr Gly Ser Ser Leu            #              20750                                                           - GAC AAA AGC TTT ATG GAT GAA AAT GAT TAC CA - #G TCC TTT ATC CAT AAC          3568                                                                           Asp Lys Ser Phe Met Asp Glu Asn Asp Tyr Gl - #n Ser Phe Ile His Asn            #          20905                                                               - CCC AGC CTC ACA GTG ACA GTG CCA ATT GCA CC - #T GGG GAG TCT GAT TTG          3616               
                                                            Pro Ser Leu Thr Val Thr Val Pro Ile Ala Pr - #o Gly Glu Ser Asp Leu            #      21050                                                                   - GAG ATT ATG AAC ACA GAA GAG CTT AGC AGT GA - #C TCA GAC AGT GAC TAC          3664                                                                           Glu Ile Met Asn Thr Glu Glu Leu Ser Ser As - #p Ser Asp Ser Asp Tyr            #  21205                                                                       - AGC AAA GAG AAA CGG AAC CGA TCA AGC TCT TC - #T GAG TGC AGC ACT GTT          3712                                                                           Ser Lys Glu Lys Arg Asn Arg Ser Ser Ser Se - #r Glu Cys Ser Thr Val            #               21402130 - #                2135                               - GAC AAC CCT CTG CCA GGA GAA GAG GAG GCT GA - #A GCA GAG CCC GTA AAC          3760                                                                           Asp Asn Pro Leu Pro Gly Glu Glu Glu Ala Gl - #u Ala Glu Pro Val Asn            #              21550                                                           - GCA GAT GAG CCT GAA GCC TGC TTT ACA GAT GG - #T TGT GTG AGG AGA TTT          3808                                                                           Ala Asp Glu Pro Glu Ala Cys Phe Thr Asp Gl - #y Cys Val Arg Arg Phe            #          21705                                                               - CCA TGC TGC CAA GTT AAT GTA GAC TCT GGG AA - #A GGG AAA GTT TGG TGG          3856                                                                           Pro Cys Cys Gln Val Asn Val Asp Ser Gly Ly - #s Gly Lys Val Trp Trp            #      21850                                                                   - ACC ATC AGG AAG ACG TGC TAC AGG ATA GTT GA - #A CAC AGC TGG TTT GAA          3904                                                                           Thr Ile Arg Lys Thr Cys Tyr Arg Ile Val Gl 
- #u His Ser Trp Phe Glu            #  22005                                                                       - AGC TTC ATC GTT CTC ATG ATC CTG CTC AGC AG - #T GGA GCT CTG GCT TTT          3952                                                                           Ser Phe Ile Val Leu Met Ile Leu Leu Ser Se - #r Gly Ala Leu Ala Phe            #               22202210 - #                2215                               - GAA GAT ATC TAT ATT GAA AAG AAA AAG ACC AT - #T AAG ATT ATC CTG GAG          4000                                                                           Glu Asp Ile Tyr Ile Glu Lys Lys Lys Thr Il - #e Lys Ile Ile Leu Glu            #              22350                                                           - TAT GCT GAC AAG ATA TTC ACC TAC ATC TTC AT - #T CTG GAA ATG CTT CTA          4048                                                                           Tyr Ala Asp Lys Ile Phe Thr Tyr Ile Phe Il - #e Leu Glu Met Leu Leu            #          22505                                                               - AAA TGG GTC GCA TAT GGG TAT AAA ACA TAT TT - #C ACT AAT GCC TGG TGT          4096                                                                           Lys Trp Val Ala Tyr Gly Tyr Lys Thr Tyr Ph - #e Thr Asn Ala Trp Cys            #      22650                                                                   - TGG CTG GAC TTC TTA ATT GTT GAT GTG TCT CT - #A GTT ACT TTA GTA GCC          4144                                                                           Trp Leu Asp Phe Leu Ile Val Asp Val Ser Le - #u Val Thr Leu Val Ala            #  22805                                                                       - AAC ACT CTT GGC TAC TCA GAC CTT GGC CCC AT - #T AAA TCT CTA CGG ACA          4192                                                                           Asn Thr Leu Gly Tyr Ser Asp Leu Gly Pro Il - #e Lys Ser Leu Arg Thr            #               23002290 - #                2295                    
           - CTG AGG GCC CTA AGA CCC CTA AGA GCC TTG TC - #T AGA TTT GAA GGA ATG          4240                                                                           Leu Arg Ala Leu Arg Pro Leu Arg Ala Leu Se - #r Arg Phe Glu Gly Met            #              23150                                                           - AGG GTA GTG GTC AAC GCA CTC ATA GGA GCA AT - #C CCT TCC ATC ATG AAC          4288                                                                           Arg Val Val Val Asn Ala Leu Ile Gly Ala Il - #e Pro Ser Ile Met Asn            #          23305                                                               - GTG CTT CTC GTG TGC CTT ATA TTC TGG CTA AT - #A TTT AGC ATC ATG GGA          4336                                                                           Val Leu Leu Val Cys Leu Ile Phe Trp Leu Il - #e Phe Ser Ile Met Gly            #      23450                                                                   - GTC AAT CTG TTT GCT GGC AAG TTC TAT GAG TG - #T GTC AAC ACC ACC GAT          4384                                                                           Val Asn Leu Phe Ala Gly Lys Phe Tyr Glu Cy - #s Val Asn Thr Thr Asp            #  23605                                                                       - GGG TCA CGA TTT CCT ACA TCT CAA GTT GCA AA - #C CGT TCT GAG TGT TTT          4432                                                                           Gly Ser Arg Phe Pro Thr Ser Gln Val Ala As - #n Arg Ser Glu Cys Phe            #               23802370 - #                2375                               - GCC CTG ATG AAC GTT AGT GGA AAT GTG CGA TG - #G AAA AAC CTG AAA GTA          4480                                                                           Ala Leu Met Asn Val Ser Gly Asn Val Arg Tr - #p Lys Asn Leu Lys Val            #              23950                                                           - AAC TTC GAC AAC GTT GGG CTT GGT TAC CTG TC - #G CTG CTT CAA GTT GCA          4528          
                                                                 Asn Phe Asp Asn Val Gly Leu Gly Tyr Leu Se - #r Leu Leu Gln Val Ala            #          24105                                                               - ACA TTC AAG GGC TGG ATG GAT ATT ATG TAT GC - #A GCA GTT GAC TCT GTT          4576                                                                           Thr Phe Lys Gly Trp Met Asp Ile Met Tyr Al - #a Ala Val Asp Ser Val            #      24250                                                                   - AAT GTA AAT GAA CAG CCG AAA TAC GAA TAC AG - #T CTC TAC ATG TAC ATT          4624                                                                           Asn Val Asn Glu Gln Pro Lys Tyr Glu Tyr Se - #r Leu Tyr Met Tyr Ile            #  24405                                                                       - TAC TTT GTC ATC TTC ATC ATC TTC GGC TCA TT - #C TTC ACG TTG AAC CTG          4672                                                                           Tyr Phe Val Ile Phe Ile Ile Phe Gly Ser Ph - #e Phe Thr Leu Asn Leu            #               24602450 - #                2455                               - TTC ATT GGT GTC ATC ATA GAT AAT TTC AAC CA - #A CAG AAA AAA AAG CTT          4720                                                                           Phe Ile Gly Val Ile Ile Asp Asn Phe Asn Gl - #n Gln Lys Lys Lys Leu            #              24750                                                           - GGA GGT CAA GAT ATC TTT ATG ACA GAA GAA CA - #G AAG AAA TAC TAT AAT          4768                                                                           Gly Gly Gln Asp Ile Phe Met Thr Glu Glu Gl - #n Lys Lys Tyr Tyr Asn            #          24905                                                               - GCA ATG AAG AAG CTT GGG TCC AAA AAA CCA CA - #A AAA CCA ATT CCA AGG          4816                                                                           Ala Met Lys Lys Leu Gly Ser Lys Lys 
Pro Gl - #n Lys Pro Ile Pro Arg            #      25050                                                                   - CCA GGG AAC AAA TTC CAA GGA TGT ATA TTT GA - #C TTA GTG ACA AAC CAA          4864                                                                           Pro Gly Asn Lys Phe Gln Gly Cys Ile Phe As - #p Leu Val Thr Asn Gln            #  25205                                                                       - GCT TTT GAT ATC ACC ATC ATG GTT CTT ATA TG - #C CTC AAC ATG GTA ACC          4912                                                                           Ala Phe Asp Ile Thr Ile Met Val Leu Ile Cy - #s Leu Asn Met Val Thr            #               25402530 - #                2535                               - ATG ATG GTA GAA AAA GAG GGG CAA ACT GAG TA - #C ATG GAT TAT GTT TTA          4960                                                                           Met Met Val Glu Lys Glu Gly Gln Thr Glu Ty - #r Met Asp Tyr Val Leu            #              25550                                                           - CAC TGG ATC AAC ATG GTC TTC ATT ATC CTG TT - #C ACT GGG GAG TGT GTG          5008                                                                           His Trp Ile Asn Met Val Phe Ile Ile Leu Ph - #e Thr Gly Glu Cys Val            #          25705                                                               - CTG AAG CTA ATC TCC CTC AGA CAT TAC TAC TT - #C ACT GTG GGT TGG AAC          5056                                                                           Leu Lys Leu Ile Ser Leu Arg His Tyr Tyr Ph - #e Thr Val Gly Trp Asn            #      25850                                                                   - ATT TTT GAT TTT GTG GTA GTG ATC CTC TCC AT - #T GTA GGA ATG TTT CTC          5104                                                                           Ile Phe Asp Phe Val Val Val Ile Leu Ser Il - #e Val Gly Met Phe Leu            #  26005                                                     
                  - GCT GAG ATG ATA GAG AAG TAT TTC GTG TCC CC - #T ACC CTG TTC CGA GTC          5152                                                                           Ala Glu Met Ile Glu Lys Tyr Phe Val Ser Pr - #o Thr Leu Phe Arg Val            #               26202610 - #                2615                               - ATC CGC CTG GCC AGG ATT GGA CGA ATC CTA CG - #C CTG ATC AAA GGC GCC          5200                                                                           Ile Arg Leu Ala Arg Ile Gly Arg Ile Leu Ar - #g Leu Ile Lys Gly Ala            #              26350                                                           - AAG GGG ATC CGC ACT CTG CTC TTT GCT TTG AT - #G ATG TCC CTT CCT GCG          5248                                                                           Lys Gly Ile Arg Thr Leu Leu Phe Ala Leu Me - #t Met Ser Leu Pro Ala            #          26505                                                               - CTG TTC AAC ATC GGC CTC CTG CTT TTC CTG GT - #C ATG TTC ATC TAC GCC          5296                                                                           Leu Phe Asn Ile Gly Leu Leu Leu Phe Leu Va - #l Met Phe Ile Tyr Ala            #      26650                                                                   - ATC TTT GGG ATG TCC AAC TTT GCC TAC GTT AA - #A AAG GAG GCT GGA ATT          5344                                                                           Ile Phe Gly Met Ser Asn Phe Ala Tyr Val Ly - #s Lys Glu Ala Gly Ile            #  26805                                                                       - AAT GAC ATG TTC AAC TTT GAG ACT TTT GGC AA - #C AGC ATG ATC TGC TTG          5392                                                                           Asn Asp Met Phe Asn Phe Glu Thr Phe Gly As - #n Ser Met Ile Cys Leu            #               27002690 - #                2695                               - TTC CAA ATC ACC ACC TCT GCC GGC TGG GAC GG - #A CTG CTG GCC CCC ATC          5440   
                                                                        Phe Gln Ile Thr Thr Ser Ala Gly Trp Asp Gl - #y Leu Leu Ala Pro Ile            #              27150                                                           - CTC AAC AGC GCA CCT CCC GAC TGT GAC CCT AA - #A AAA GTT CAC CCA GGA          5488                                                                           Leu Asn Ser Ala Pro Pro Asp Cys Asp Pro Ly - #s Lys Val His Pro Gly            #          27305                                                               - AGT TCA GTG GAA GGG GAC TGT GGG AAC CCA TC - #C GTG GGG ATT TTT TAC          5536                                                                           Ser Ser Val Glu Gly Asp Cys Gly Asn Pro Se - #r Val Gly Ile Phe Tyr            #      27450                                                                   - TTT GTC AGC TAC ATC ATC ATA TCC TTC CTG GT - #G GTG GTG AAC ATG TAC          5584                                                                           Phe Val Ser Tyr Ile Ile Ile Ser Phe Leu Va - #l Val Val Asn Met Tyr            #  27605                                                                       - ATC GCT GTC ATC CTG GAG AAC TTC AGC GTC GC - #C ACC GAA GAG AGC ACT          5632                                                                           Ile Ala Val Ile Leu Glu Asn Phe Ser Val Al - #a Thr Glu Glu Ser Thr            #               27802770 - #                2775                               - GAG CCT CTG AGT GAG GAC GAC TTT GAG ATG TT - #C TAC GAG GTC TGG GAG          5680                                                                           Glu Pro Leu Ser Glu Asp Asp Phe Glu Met Ph - #e Tyr Glu Val Trp Glu            #              27950                                                           - AAG TTC GAC CCT GAC GCC ACT CAG TTC ATA GA - #G TTC TGC AAG CTC TCT          5728                                                                           Lys Phe Asp Pro Asp Ala Thr Gln 
Phe Ile Gl - #u Phe Cys Lys Leu Ser            #          28105                                                               - GAC TTT GCA GCT GCC CTG GAT CCT CCC CTC CT - #C ATC GCA AAG CCA AAC          5776                                                                           Asp Phe Ala Ala Ala Leu Asp Pro Pro Leu Le - #u Ile Ala Lys Pro Asn            #      28250                                                                   - AAA GTC CAG CTC ATT GCC ATG GAC CTG CCC AT - #G GTG AGT GGA GAC CGC          5824                                                                           Lys Val Gln Leu Ile Ala Met Asp Leu Pro Me - #t Val Ser Gly Asp Arg            #  28405                                                                       - ATC CAC TGC CTG GAC ATC TTG TTT GCT TTT AC - #A AAG CGG GTC CTG GGT          5872                                                                           Ile His Cys Leu Asp Ile Leu Phe Ala Phe Th - #r Lys Arg Val Leu Gly            #               28602850 - #                2855                               - GAG GGT GGA GAG ATG GAT TCT CTT CGT TCA CA - #G ATG GAA GAA AGG TTC          5920                                                                           Glu Gly Gly Glu Met Asp Ser Leu Arg Ser Gl - #n Met Glu Glu Arg Phe            #              28750                                                           - ATG TCA GCC AAT CCT TCT AAA GTG TCC TAT GA - #A CCC ATC ACG ACC ACA          5968                                                                           Met Ser Ala Asn Pro Ser Lys Val Ser Tyr Gl - #u Pro Ile Thr Thr Thr            #          28905                                                               - CTG AAG AGA AAA CAA GAG GAG GTG TCC GCG AC - #T ATC ATT CAG CGT GCT          6016                                                                           Leu Lys Arg Lys Gln Glu Glu Val Ser Ala Th - #r Ile Ile Gln Arg Ala            #      29050                                             
                      - TAC AGA CGG TAT CGC CTC AGA CAA CAC GTC AA - #G AAT ATA TCG AGT ATA          6064                                                                           Tyr Arg Arg Tyr Arg Leu Arg Gln His Val Ly - #s Asn Ile Ser Ser Ile            #  29205                                                                       - TAC ATA AAA GAT GGA GAC AGG GAT GAT GAT TT - #G CCC AAT AAA GAA GAT          6112                                                                           Tyr Ile Lys Asp Gly Asp Arg Asp Asp Asp Le - #u Pro Asn Lys Glu Asp            #               29402930 - #                2935                               - ACA GTT TTT GAT AAC GTG AAC GAG AAC TCA AG - #T CCG GAA AAG ACA GAT          6160                                                                           Thr Val Phe Asp Asn Val Asn Glu Asn Ser Se - #r Pro Glu Lys Thr Asp            #              29550                                                           - GTA ACT GCC TCA ACC ATC TCG CCA CCT TCC TA - #T GAC AGT GTC ACA AAG          6208                                                                           Val Thr Ala Ser Thr Ile Ser Pro Pro Ser Ty - #r Asp Ser Val Thr Lys            #          29705                                                               - CCA GAT CAA GAG AAA TAT GAA ACA GAC AAA AC - #A GAG AAG GAA GAC AAA          6256                                                                           Pro Asp Gln Glu Lys Tyr Glu Thr Asp Lys Th - #r Glu Lys Glu Asp Lys            #      29850                                                                   - GAG AAA GAT GAA AGC AGG AAA TAGAGCTTTG GTTTTGATA - #C ACTGTTGACA             6307                                                                           Glu Lys Asp Glu Ser Arg Lys                                                    #   2995                                                                       - GCCTGTGAAG GTTGACTCAC TCGTGTTAGT AAGACTCTTT TACGGAGGTC TA - #TCCAAACT        
6367                                                                           - CTTTTATCAA AAATTCTCAA GGCAGCACAG CCATTAGCTC TGATCCAACG AG - #GCAGAGGG        6427                                                                           #             6452 CTAT GTTTT
- (2) INFORMATION FOR SEQ ID NO:10:
-      (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 1984 amino acids
          (B) TYPE: amino acid
          (D) TOPOLOGY: linear
-     (ii) MOLECULE TYPE: protein
-     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
- Met Ala Met Leu Pro Pro Pro Gly Pro Gln Se - #r Phe Val His Phe Thr          #                 15                                                           - Lys Gln Ser Leu Ala Leu Ile Glu Gln Arg Il - #e Ser Glu Glu Lys Ala          #             30                                                               - Lys Glu His Lys Asp Glu Lys Lys Asp Asp Gl - #u Glu Glu Gly Pro Lys          #         45                                                                   - Pro Ser Ser Asp Leu Glu Ala Gly Lys Gln Le - #u Pro Phe Ile Tyr Gly          #     60                                                                       - Asp Ile Pro Pro Gly Met Val Ser Glu Pro Le - #u Glu Asp Leu Asp Pro          # 80                                                                           - Tyr Tyr Ala Asp Lys Lys Thr Phe Ile Val Le - #u Asn Lys Gly Lys Ala          #                 95                                                           - Ile Phe Arg Phe Asn Ala Thr Pro Ala Leu Ty - #r Met Leu Ser Pro Phe          #           110                                                                - Ser Pro Leu Arg Arg 
Ile Ser Ile Lys Ile Le - #u Val His Ser Leu Phe          #       125                                                                    - Ser Met Leu Ile Met Cys Thr Ile Leu Thr As - #n Cys Ile Phe Met Thr          #   140                                                                        - Leu Ser Asn Pro Pro Glu Trp Thr Lys Asn Va - #l Glu Tyr Thr Phe Thr          145                 1 - #50                 1 - #55                 1 -        #60                                                                            - Gly Ile Tyr Thr Phe Glu Ser Leu Ile Lys Il - #e Leu Ala Arg Gly Phe          #               175                                                            - Cys Val Gly Glu Phe Thr Phe Leu Arg Asp Pr - #o Trp Asn Trp Leu Asp          #           190                                                                - Phe Val Val Ile Val Phe Ala Tyr Leu Thr Gl - #u Phe Val Asn Leu Gly          #       205                                                                    - Asn Val Ser Ala Leu Arg Thr Phe Arg Val Le - #u Arg Ala Leu Lys Thr          #   220                                                                        - Ile Ser Val Ile Pro Gly Leu Lys Thr Ile Va - #l Gly Ala Leu Ile Gln          225                 2 - #30                 2 - #35                 2 -        #40                                                                            - Ser Val Lys Lys Leu Ser Asp Val Met Ile Le - #u Thr Val Phe Cys Leu          #               255                                                            - Ser Val Phe Ala Leu Ile Gly Leu Gln Leu Ph - #e Met Gly Asn Leu Lys          #           270                                                                - His Lys Cys Phe Arg Lys Glu Leu Glu Glu As - #n Glu Thr Leu Glu Ser          #       285                                                                    - Ile Met Asn Thr Ala Glu Ser Glu Glu Glu Le - #u Lys Lys Tyr Phe Tyr          #   300                                        
                                - Tyr Leu Glu Gly Ser Lys Asp Ala Leu Leu Cy - #s Gly Phe Ser Thr Asp          305                 3 - #10                 3 - #15                 3 -        #20                                                                            - Ser Gly Gln Cys Pro Glu Gly Tyr Ile Cys Va - #l Lys Ala Gly Arg Asn          #               335                                                            - Pro Asp Tyr Gly Tyr Thr Ser Phe Asp Thr Ph - #e Ser Trp Ala Phe Leu          #           350                                                                - Ala Leu Phe Arg Leu Met Thr Gln Asp Tyr Tr - #p Glu Asn Leu Tyr Gln          #       365                                                                    - Gln Thr Leu Arg Ala Ala Gly Lys Thr Tyr Me - #t Ile Phe Phe Val Val          #   380                                                                        - Val Ile Phe Leu Gly Ser Phe Tyr Leu Ile As - #n Leu Ile Leu Ala Val          385                 3 - #90                 3 - #95                 4 -        #00                                                                            - Val Ala Met Ala Tyr Glu Glu Gln Asn Gln Al - #a Asn Ile Glu Glu Ala          #               415                                                            - Lys Gln Lys Glu Leu Glu Phe Gln Gln Met Le - #u Asp Arg Leu Lys Lys          #           430                                                                - Glu Gln Glu Glu Ala Glu Ala Ile Ala Ala Al - #a Ala Ala Glu Phe Thr          #       445                                                                    - Ser Ile Gly Arg Ser Arg Ile Met Gly Leu Se - #r Glu Ser Ser Ser Glu          #   460                                                                        - Thr Ser Arg Leu Ser Ser Lys Ser Ala Lys Gl - #u Arg Arg Asn Arg Arg          465                 4 - #70                 4 - #75                 4 -        #80                                                                     
       - Lys Lys Lys Lys Gln Lys Met Ser Ser Gly Gl - #u Glu Lys Gly Asp Asp          #               495                                                            - Glu Lys Leu Ser Lys Ser Gly Ser Glu Glu Se - #r Ile Arg Lys Lys Ser          #           510                                                                - Phe His Leu Gly Val Glu Gly His His Arg Th - #r Arg Glu Lys Arg Leu          #       525                                                                    - Ser Thr Pro Asn Gln Ser Pro Leu Ser Ile Ar - #g Gly Ser Leu Phe Ser          #   540                                                                        - Ala Arg Arg Ser Ser Arg Thr Ser Leu Phe Se - #r Phe Lys Gly Arg Gly          545                 5 - #50                 5 - #55                 5 -        #60                                                                            - Arg Asp Leu Gly Ser Glu Thr Glu Phe Ala As - #p Asp Glu His Ser Ile          #               575                                                            - Phe Gly Asp Asn Glu Ser Arg Arg Gly Ser Le - #u Phe Val Pro His Arg          #           590                                                                - Pro Arg Glu Arg Arg Ser Ser Asn Ile Ser Gl - #n Ala Ser Arg Ser Pro          #       605                                                                    - Pro Val Leu Pro Val Asn Gly Lys Met His Se - #r Ala Val Asp Cys Asn          #   620                                                                        - Gly Val Val Ser Leu Val Asp Gly Pro Ser Al - #a Leu Met Leu Pro Asn          625                 6 - #30                 6 - #35                 6 -        #40                                                                            - Gly Gln Leu Leu Pro Glu Val Ile Ile Asp Ly - #s Ala Thr Ser Asp Asp          #               655                                                            - Ser Gly Thr Thr Asn Gln Met Arg Lys Lys Ar - #g Leu Ser Ser Ser Tyr          #           670   
                                                             - Phe Leu Ser Glu Asp Met Leu Asn Asp Pro Hi - #s Leu Arg Gln Arg Ala          #       685                                                                    - Met Ser Arg Ala Ser Ile Leu Thr Asn Thr Va - #l Glu Glu Leu Glu Glu          #   700                                                                        - Ser Arg Gln Lys Cys Pro Pro Trp Trp Tyr Ar - #g Phe Ala His Thr Phe          705                 7 - #10                 7 - #15                 7 -        #20                                                                            - Leu Ile Trp Asn Cys Ser Pro Tyr Trp Ile Ly - #s Phe Lys Lys Leu Ile          #               735                                                            - Tyr Phe Ile Val Met Asp Pro Phe Val Asp Le - #u Ala Ile Thr Ile Cys          #           750                                                                - Ile Val Leu Asn Thr Leu Phe Met Ala Met Gl - #u His His Pro Met Thr          #       765                                                                    - Glu Glu Phe Lys Asn Val Leu Ala Val Gly As - #n Leu Ile Phe Thr Gly          #   780                                                                        - Ile Phe Ala Ala Glu Met Val Leu Lys Leu Il - #e Ala Met Asp Pro Tyr          785                 7 - #90                 7 - #95                 8 -        #00                                                                            - Glu Tyr Phe Gln Val Gly Trp Asn Ile Phe As - #p Ser Leu Ile Val Thr          #               815                                                            - Leu Ser Leu Ile Glu Leu Phe Leu Ala Asp Va - #l Glu Gly Leu Ser Val          #           830                                                                - Leu Arg Ser Phe Arg Leu Leu Arg Val Phe Ly - #s Leu Ala Lys Ser Trp          #       845                                                                    - Pro Thr Leu Asn Met Leu Ile Lys Ile Ile 
Gl - #y Asn Ser Val Gly Ala          #   860                                                                        - Leu Gly Asn Leu Thr Leu Val Leu Ala Ile Il - #e Val Phe Ile Phe Ala          865                 8 - #70                 8 - #75                 8 -        #80                                                                            - Val Val Gly Met Gln Leu Phe Gly Lys Ser Ty - #r Lys Glu Cys Val Cys          #               895                                                            - Lys Ile Asn Val Asp Cys Lys Leu Pro Arg Tr - #p His Met Asn Asp Phe          #           910                                                                - Phe His Ser Phe Leu Ile Val Phe Arg Val Le - #u Cys Gly Glu Trp Ile          #       925                                                                    - Glu Thr Met Trp Asp Cys Met Glu Val Ala Gl - #y Gln Thr Met Cys Leu          #   940                                                                        - Ile Val Tyr Met Met Val Met Val Ile Gly As - #n Leu Val Val Leu Asn          945                 9 - #50                 9 - #55                 9 -        #60                                                                            - Leu Phe Leu Ala Leu Leu Leu Ser Ser Phe Se - #r Ser Asp Asn Leu Thr          #               975                                                            - Ala Ile Glu Glu Asp Thr Asp Ala Asn Asn Le - #u Gln Ile Ala Val Ala          #           990                                                                - Arg Ile Lys Arg Gly Ile Asn Tyr Val Lys Gl - #n Thr Leu Arg Glu Phe          #      10050                                                                   - Ile Leu Lys Ser Phe Ser Lys Lys Pro Lys Gl - #y Ser Lys Asp Thr Lys          #  10205                                                                       - Arg Thr Ala Asp Pro Asn Asn Lys Lys Glu As - #n Tyr Ile Ser Asn Arg          #               10401030 - #                1035                   
            - Thr Leu Ala Glu Met Ser Lys Asp His Asn Ph - #e Leu Lys Glu Lys Asp          #              10550                                                           - Arg Ile Ser Gly Tyr Gly Ser Ser Leu Asp Ly - #s Ser Phe Met Asp Glu          #          10705                                                               - Asn Asp Tyr Gln Ser Phe Ile His Asn Pro Se - #r Leu Thr Val Thr Val          #      10850                                                                   - Pro Ile Ala Pro Gly Glu Ser Asp Leu Glu Il - #e Met Asn Thr Glu Glu          #  11005                                                                       - Leu Ser Ser Asp Ser Asp Ser Asp Tyr Ser Ly - #s Glu Lys Arg Asn Arg          #               11201110 - #                1115                               - Ser Ser Ser Ser Glu Cys Ser Thr Val Asp As - #n Pro Leu Pro Gly Glu          #              11350                                                           - Glu Glu Ala Glu Ala Glu Pro Val Asn Ala As - #p Glu Pro Glu Ala Cys          #          11505                                                               - Phe Thr Asp Gly Cys Val Arg Arg Phe Pro Cy - #s Cys Gln Val Asn Val          #      11650                                                                   - Asp Ser Gly Lys Gly Lys Val Trp Trp Thr Il - #e Arg Lys Thr Cys Tyr          #  11805                                                                       - Arg Ile Val Glu His Ser Trp Phe Glu Ser Ph - #e Ile Val Leu Met Ile          #               12001190 - #                1195                               - Leu Leu Ser Ser Gly Ala Leu Ala Phe Glu As - #p Ile Tyr Ile Glu Lys          #              12150                                                           - Lys Lys Thr Ile Lys Ile Ile Leu Glu Tyr Al - #a Asp Lys Ile Phe Thr          #          12305                                                               - Tyr Ile Phe Ile Leu Glu Met Leu Leu Lys Tr - #p Val Ala Tyr Gly Tyr          #      12450 
                                                                  - Lys Thr Tyr Phe Thr Asn Ala Trp Cys Trp Le - #u Asp Phe Leu Ile Val          #  12605                                                                       - Asp Val Ser Leu Val Thr Leu Val Ala Asn Th - #r Leu Gly Tyr Ser Asp          #               12801270 - #                1275                               - Leu Gly Pro Ile Lys Ser Leu Arg Thr Leu Ar - #g Ala Leu Arg Pro Leu          #              12950                                                           - Arg Ala Leu Ser Arg Phe Glu Gly Met Arg Va - #l Val Val Asn Ala Leu          #          13105                                                               - Ile Gly Ala Ile Pro Ser Ile Met Asn Val Le - #u Leu Val Cys Leu Ile          #      13250                                                                   - Phe Trp Leu Ile Phe Ser Ile Met Gly Val As - #n Leu Phe Ala Gly Lys          #  13405                                                                       - Phe Tyr Glu Cys Val Asn Thr Thr Asp Gly Se - #r Arg Phe Pro Thr Ser          #               13601350 - #                1355                               - Gln Val Ala Asn Arg Ser Glu Cys Phe Ala Le - #u Met Asn Val Ser Gly          #              13750                                                           - Asn Val Arg Trp Lys Asn Leu Lys Val Asn Ph - #e Asp Asn Val Gly Leu          #          13905                                                               - Gly Tyr Leu Ser Leu Leu Gln Val Ala Thr Ph - #e Lys Gly Trp Met Asp          #      14050                                                                   - Ile Met Tyr Ala Ala Val Asp Ser Val Asn Va - #l Asn Glu Gln Pro Lys          #  14205                                                                       - Tyr Glu Tyr Ser Leu Tyr Met Tyr Ile Tyr Ph - #e Val Ile Phe Ile Ile          #               14401430 - #                1435                               - Phe Gly Ser Phe Phe Thr Leu Asn Leu 
Phe Ile Gly Val Ile Ile Asp    1455
Asn Phe Asn Gln Gln Lys Lys Lys Leu Gly Gly Gln Asp Ile Phe Met    1470
Thr Glu Glu Gln Lys Lys Tyr Tyr Asn Ala Met Lys Lys Leu Gly Ser    1485
Lys Lys Pro Gln Lys Pro Ile Pro Arg Pro Gly Asn Lys Phe Gln Gly    1500
Cys Ile Phe Asp Leu Val Thr Asn Gln Ala Phe Asp Ile Thr Ile Met    1510 1515 1520
Val Leu Ile Cys Leu Asn Met Val Thr Met Met Val Glu Lys Glu Gly    1535
Gln Thr Glu Tyr Met Asp Tyr Val Leu His Trp Ile Asn Met Val Phe    1550
Ile Ile Leu Phe Thr Gly Glu Cys Val Leu Lys Leu Ile Ser Leu Arg    1565
His Tyr Tyr Phe Thr Val Gly Trp Asn Ile Phe Asp Phe Val Val Val    1580
Ile Leu Ser Ile Val Gly Met Phe Leu Ala Glu Met Ile Glu Lys Tyr    1590 1595 1600
Phe Val Ser Pro Thr Leu Phe Arg Val Ile Arg Leu Ala Arg Ile Gly    1615
Arg Ile Leu Arg Leu Ile Lys Gly Ala Lys Gly Ile Arg Thr Leu Leu    1630
Phe Ala Leu Met Met Ser Leu Pro Ala Leu Phe Asn Ile Gly Leu Leu    1645
Leu Phe Leu Val Met Phe Ile Tyr Ala Ile Phe Gly Met Ser Asn Phe    1660
Ala Tyr Val Lys Lys Glu Ala Gly Ile Asn Asp Met Phe Asn Phe Glu    1670 1675 1680
Thr Phe Gly Asn Ser Met Ile Cys Leu Phe Gln Ile Thr Thr Ser Ala    1695
Gly Trp Asp Gly Leu Leu Ala Pro Ile Leu Asn Ser Ala Pro Pro Asp    1710
Cys Asp Pro Lys Lys Val His Pro Gly Ser Ser Val Glu Gly Asp Cys    1725
Gly Asn Pro Ser Val Gly Ile Phe Tyr Phe Val Ser Tyr Ile Ile Ile    1740
Ser Phe Leu Val Val Val Asn Met Tyr Ile Ala Val Ile Leu Glu Asn    1750 1755 1760
Phe Ser Val Ala Thr Glu Glu Ser Thr Glu Pro Leu Ser Glu Asp Asp    1775
Phe Glu Met Phe Tyr Glu Val Trp Glu Lys Phe Asp Pro Asp Ala Thr    1790
Gln Phe Ile Glu Phe Cys Lys Leu Ser Asp Phe Ala Ala Ala Leu Asp    1805
Pro Pro Leu Leu Ile Ala Lys Pro Asn Lys Val Gln Leu Ile Ala Met    1820
Asp Leu Pro Met Val Ser Gly Asp Arg Ile His Cys Leu Asp Ile Leu    1830 1835 1840
Phe Ala Phe Thr Lys Arg Val Leu Gly Glu Gly Gly Glu Met Asp Ser    1855
Leu Arg Ser Gln Met Glu Glu Arg Phe Met Ser Ala Asn Pro Ser Lys    1870
Val Ser Tyr Glu Pro Ile Thr Thr Thr Leu Lys Arg Lys Gln Glu Glu    1885
Val Ser Ala Thr Ile Ile Gln Arg Ala Tyr Arg Arg Tyr Arg Leu Arg    1900
Gln His Val Lys Asn Ile Ser Ser Ile Tyr Ile Lys Asp Gly Asp Arg    1910 1915 1920
Asp Asp Asp Leu Pro Asn Lys Glu Asp Thr Val Phe Asp Asn Val Asn    1935
Glu Asn Ser Ser Pro Glu Lys Thr Asp Val Thr Ala Ser Thr Ile Ser    1950
Pro Pro Ser Tyr Asp Ser Val Thr Lys Pro Asp Gln Glu Lys Tyr Glu    1965
Thr Asp Lys Thr Glu Lys Glu Asp Lys Glu Lys Asp Glu Ser Arg Lys    1980

(2) INFORMATION FOR SEQ ID NO:11:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 1989 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: Not Relevant
          (D) TOPOLOGY: Not Relevant

    (ii) MOLECULE TYPE: protein

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

Met Ala Met Leu Pro Pro Pro Gly
Pro Gln Ser Phe Val His Phe Thr    15
Lys Gln Ser Leu Ala Leu Ile Glu Gln Arg Ile Xaa Glu Xaa Lys Xaa    30
Lys Glu Xaa Lys Xaa Glu Lys Lys Asp Asp Xaa Glu Glu Xaa Pro Lys    45
Pro Ser Ser Asp Leu Glu Ala Gly Lys Gln Leu Pro Phe Ile Tyr Gly    60
Asp Ile Pro Pro Gly Met Val Ser Glu Pro Leu Glu Asp Leu Asp Pro    80
Tyr Tyr Ala Asp Lys Lys Thr Phe Ile Val Leu Asn Lys Gly Lys Xaa    95
Ile Phe Arg Phe Asn Ala Thr Pro Ala Leu Tyr Met Leu Ser Pro Phe    110
Ser Pro Leu Arg Arg Ile Ser Ile Lys Ile Leu Val His Ser Leu Phe    125
Ser Met Leu Ile Met Cys Thr Ile Leu Thr Asn Cys Ile Phe Met Thr    140
Xaa Xaa Asn Pro Pro Xaa Trp Thr Lys Asn Val Xaa Tyr Thr Phe Thr    145 150 155 160
Gly Ile Tyr Thr Phe Glu Ser Leu Xaa Lys Ile Leu Ala Arg Gly Phe    175
Cys Val Gly Glu Phe Thr Phe Leu Arg Asp Pro Trp Asn Trp Leu Asp    190
Phe Val Val Ile Val Phe Ala Tyr Leu Thr Glu Phe Val Asn Leu Gly    205
Asn Val Ser Ala Leu Arg Thr Phe Arg Val Leu Arg Ala Leu Lys Thr    220
Ile Ser Val Ile Pro Gly Leu Lys Thr Ile Val Gly Ala Leu Ile Gln    225 230 235 240
Ser Val Lys Lys Leu Ser Asp Val Met Ile Leu Thr Val Phe Cys Leu    255
Ser Val Phe Ala Leu Ile Gly Leu Gln Leu Phe Met Gly Asn Leu Lys    270
His Lys Cys Phe Arg Xaa Xaa Leu Glu Xaa Asn Glu Thr Leu Glu Ser    285
Ile Met Asn Thr Xaa Glu Ser Glu Glu Xaa Xaa Xaa Lys Tyr Phe Tyr    300
Tyr Leu Glu Gly Ser Lys Asp Ala Leu Leu Cys Gly Phe Ser Thr Asp    305 310 315 320
Ser Gly Gln Cys Pro Glu Gly Tyr Xaa Cys Val Lys Xaa Gly Arg Asn    335
Pro Asp Tyr Gly Tyr Thr Ser Phe Asp Thr Phe Ser Trp Ala Phe Leu    350
Ala Leu Phe Arg Leu Met Thr Gln Asp Tyr Trp Glu Asn Leu Tyr Gln    365
Gln Thr Leu Arg Ala Ala Gly Lys Thr Tyr Met Ile Phe Phe Val Val    380
Val Ile Phe Leu Gly Ser Phe Tyr Leu Ile Asn Leu Ile Leu Ala Val    385 390 395 400
Val Ala Met Ala Tyr Glu Glu Gln Asn Gln Ala Asn Ile Glu Glu Ala    415
Lys Gln Lys Glu Leu Glu Phe Gln Gln Met Leu Asp Arg Leu Lys Lys    430
Glu Gln Glu Glu Ala Glu Ala Ile Ala Ala Ala Ala Ala Glu Xaa Thr    445
Ser Ile Xaa Arg Ser Arg Ile Met Gly Leu Ser Glu Ser Ser Ser Glu    460
Thr Ser Xaa Leu Ser Ser Lys Ser Ala Lys Glu Arg Arg Asn Arg Arg    465 470 475 480
Lys Lys Lys Xaa Gln Lys Lys Xaa Ser Ser Gly Glu Glu Lys Gly Asp    495
Xaa Glu Lys Leu Ser Lys Ser Xaa Ser Glu Xaa Ser Ile Arg Xaa Lys    510
Ser Phe His Leu Gly Val Glu Gly His Xaa Arg Xaa Xaa Glu Lys Arg    525
Leu Ser Thr Pro Asn Gln Ser Pro Leu Ser Ile Arg Gly Ser Leu Phe    540
Ser Ala Arg Arg Ser Ser Arg Thr Ser Leu Phe Ser Phe Lys Gly Arg    545 550 555 560
Gly Arg Asp Xaa Gly Ser Glu Thr Glu Phe Ala Asp Asp Glu His Ser    575
Ile Phe Gly Asp Asn Glu Ser Arg Arg Gly Ser Leu Phe Val Pro His    590
Arg Pro Xaa Glu Arg Arg Ser Ser Asn Ile Ser Gln Ala Ser Arg Ser    605
Pro Pro Xaa Leu Pro Val Asn Gly Lys Met His Ser Ala Val Asp Cys    620
Asn Gly Val Val Ser Leu Val Asp Gly Xaa Ser Ala Leu Met Leu Pro    625 630 635 640
Asn Gly Gln Leu Leu Pro Glu Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa    655
Xaa Xaa Gly Thr Thr Asn Gln Xaa Xaa Lys Lys Arg Xaa Xaa Ser Ser    670
Tyr Xaa Leu Ser Glu Asp Met Leu Asn Asp Pro Xaa Leu Arg Gln Arg    685
Ala Met Ser Arg Ala Ser Ile Leu Thr Asn Thr Val Glu Glu Leu Glu    700
Glu Ser Arg Gln Lys Cys Xaa Xaa Xaa Xaa Tyr Arg Phe Ala His Xaa    705 710 715 720
Phe Leu Ile Trp Asn Cys Ser Pro Tyr Trp Ile Lys Phe Lys Lys Xaa    735
Ile Tyr Phe Ile Val Met Asp Pro Phe Val Asp Leu Ala Ile Thr Ile    750
Cys Ile Val Leu Asn Thr Leu Phe Met Ala Met Glu His His Pro Met    765
Thr Glu Glu Phe Lys Asn Val Leu Ala Xaa Gly Asn Leu Xaa Phe Thr    780
Gly Ile Phe Ala Ala Glu Met Val Leu Lys Leu Ile Ala Met Asp Pro    785 790 795 800
Tyr Glu Tyr Phe Gln Val Gly Trp Asn Ile Phe Asp Ser Leu Ile Val    815
Thr Leu Ser Leu Xaa Glu Leu Phe Leu Ala Asp Val Glu Gly Leu Ser    830
Val Leu Arg Ser Phe Arg Leu Leu Arg Val Phe Lys Leu Ala Lys Ser    845
Trp Pro Thr Leu Asn Met Leu Ile Lys Ile Ile Gly Asn Ser Val Gly    860
Ala Leu Gly Asn Leu Thr Leu Val Leu Ala Ile Ile Val Phe Ile Phe    865 870 875 880
Ala Val Val Gly Met Gln Leu Phe Gly Lys Ser Tyr Lys Glu Cys Val    895
Cys Lys Ile Asn Xaa Asp Cys Xaa Leu Pro Arg Trp His Met Asn Asp    910
Phe Phe His Ser Phe Leu Ile Val Phe Arg Val Leu Cys Gly Glu Trp    925
Ile Glu Thr Met Trp Asp Cys Met Glu Val Ala Gly Gln Xaa Met Cys    940
Leu Ile Val Tyr Met Met Val Met Val Ile Gly Asn Leu Val Val Leu    945 950 955 960
Asn Leu Phe Leu Ala Leu Leu Leu Ser Ser Phe Ser Ser Asp Asn Leu    975
Thr Ala Ile Glu Glu Asp Xaa Asp Ala Asn Asn Leu Gln Ile Ala Val    990
Xaa Arg Ile Lys Xaa Gly Ile Asn Tyr Val Lys Gln Thr Leu Arg Glu    1005
Phe Ile Leu Lys Xaa Phe Ser Lys Lys Pro Lys Xaa Ser Xaa Xaa Xaa    1020
Xaa Xaa Xaa Xaa Asp Xaa Asn Xaa Lys Lys Glu Asn Tyr Ile Ser Asn    1030 1035 1040
Xaa Thr Leu Ala Glu Met Ser Lys Xaa His Asn Phe Leu Lys Glu Lys    1055
Asp Xaa Ile Ser Gly Xaa Gly Ser Ser Xaa Asp Lys Xaa Xaa Met Xaa    1070
Xaa Xaa Asp Xaa Gln Ser Phe Ile His Asn Pro Ser Leu Thr Val Thr    1085
Val Pro Ile Ala Pro Gly Glu Ser Asp Leu Glu Xaa Met Asn Xaa Glu    1100
Glu Leu Ser Ser Asp Ser Asp Ser Xaa Tyr Ser Lys Xaa Xaa Xaa Asn    1110 1115 1120
Arg Ser Ser Ser Ser Glu Cys Ser Thr Val Asp Asn Pro Leu Pro Gly    1135
Glu Gly Glu Glu Ala Glu Ala Glu Pro Xaa Asn Xaa Asp Glu Pro Glu    1150
Ala Cys Phe Thr Asp Gly Cys Val Arg Arg Phe Xaa Cys Cys Gln Val    1165
Asn Xaa Xaa Ser Gly Lys Gly Lys Xaa Trp Trp Xaa Ile Arg Lys Thr    1180
Cys Tyr Xaa Ile Val Glu His Ser Trp Phe Glu Ser Phe Ile Val Leu    1190 1195 1200
Met Ile Leu Leu Ser Ser Gly Ala Leu Ala Phe Glu Asp Ile Tyr Ile    1215
Glu Xaa Lys Lys Thr Ile Lys Ile Ile Leu Glu Tyr Ala Asp Lys Ile    1230
Phe Thr Tyr Ile Phe Ile Leu Glu Met Leu Leu Lys Trp Xaa Ala Tyr    1245
Gly Tyr Lys Thr Tyr Phe Thr Asn Ala Trp Cys Trp Leu Asp Phe Leu    1260
Ile Val Asp Val Ser Leu Val Thr Leu Val Ala Asn Thr Leu Gly Tyr    1270 1275 1280
Ser Asp Leu Gly Pro Ile Lys Ser Leu Arg Thr Leu Arg Ala Leu Arg    1295
Pro Leu Arg Ala Leu Ser Arg Phe Glu Gly Met Arg Val Val Val Asn    1310
Ala Leu Ile Gly Ala Ile Pro Ser Ile Met Asn Val Leu Leu Val Cys    1325
Leu Ile Phe Trp Leu Ile Phe Ser Ile Met Gly Val Asn Leu Phe Ala    1340
Gly Lys Phe Tyr Glu Cys Xaa Asn Thr Thr Asp Gly Ser Arg Phe Pro    1350 1355 1360
Xaa Ser Gln Val Xaa Asn Arg Ser Glu Cys Phe Ala Leu Met Asn Val    1375
Ser Xaa Asn Val Arg Trp Lys Asn Leu Lys Val Asn Phe Asp Asn Val    1390
Gly Leu Gly Tyr Leu Ser Leu Leu Gln Val Ala Thr Phe Lys Gly Trp    1405
Xaa Xaa Ile Met Tyr Ala Ala Val Asp Ser Val Asn Val Xaa Xaa Gln    1420
Pro Lys Tyr Glu Tyr Ser Leu Tyr Met Tyr Ile Tyr Phe Val Xaa Phe    1430 1435 1440
Ile Ile Phe Gly Ser Phe Phe Thr Leu Asn Leu Phe Ile Gly Val Ile    1455
Ile Asp Asn Phe Asn Gln Gln Lys Lys Lys Leu Gly Gly Gln Asp Ile    1470
Phe Met Thr Glu Glu Gln Lys Lys Tyr Tyr Asn Ala Met Lys Lys Leu    1485
Gly Ser Lys Lys Pro Gln Lys Pro Ile Pro Arg Pro Gly Asn Lys Xaa    1500
Gln Gly Cys Ile Phe Asp Leu Val Thr Asn Gln Ala Phe Asp Ile Xaa    1510 1515 1520
Ile Met Val Leu Ile Cys Leu Asn Met Val Thr Met Met Val Glu Lys    1535
Glu Gly Gln Xaa Xaa Xaa Met Xaa Xaa Val Leu Xaa Trp Ile Asn Xaa    1550
Val Phe Ile Ile Leu Phe Thr Gly Glu Cys Val Leu Lys Leu Ile Ser    1565
Leu Arg His Tyr Tyr Phe Thr Val Gly Trp Asn Ile Xaa Xaa Phe Val    1580
Val Val Ile Xaa Ser Ile Val Gly Met Phe Leu Ala Xaa Xaa Ile Glu    1590 1595 1600
Xaa Tyr Phe Val Ser Pro Thr Leu Phe Arg Val Ile Arg Leu Ala Arg    1615
Ile Gly Arg Ile Leu Arg Leu Xaa Lys Gly Ala Lys Gly Ile Arg Thr    1630
Leu Leu Phe Ala Leu Met Met Ser Leu Pro Ala Leu Phe Asn Ile Gly    1645
Leu Leu Leu Phe Leu Val Met Phe Ile Tyr Ala Ile Phe Gly Met Ser    1660
Asn Phe Ala Tyr Val Lys Lys Glu Xaa Gly Ile Asn Asp Met Phe Asn    1670 1675 1680
Phe Glu Thr Phe Gly Asn Ser Met Ile Cys Leu Phe Gln Ile Thr Thr    1695
Ser Ala Gly Trp Asp Gly Leu Leu Ala Pro Ile Leu Asn Ser Xaa Pro    1710
Pro Asp Cys Asp Pro Lys Lys Val His Pro Gly Ser Ser Val Glu Gly    1725
Asp Cys Gly Asn Pro Ser Val Gly Ile Phe Tyr Phe Val Ser Tyr Ile    1740
Ile Ile Ser Phe Leu Val Val Val Asn Met Tyr Ile Ala Val Ile Leu    1750 1755 1760
Glu Asn Phe Ser Val Ala Thr Glu Glu Ser Thr Glu Pro Leu Ser Glu    1775
Asp Asp Phe Glu Met Phe Tyr Glu Val Trp Glu Lys Phe Asp Pro Asp    1790
Ala Thr Gln Phe Ile Glu Phe Xaa Lys Leu Ser Asp Phe Ala Ala Ala    1805
Leu Asp Pro Pro Leu Leu Ile Ala Lys Pro Asn Lys Val Gln Leu Ile    1820
Ala Met Asp Leu Pro Met Val Ser Gly Asp Arg Ile His Cys Leu Asp    1830 1835 1840
Ile Leu Phe Ala Phe Thr Lys Arg Val Leu Gly Glu Xaa Gly Glu Met    1855
Asp Ser Leu Arg Ser Gln Met Glu Glu Arg Phe Met Ser Ala Asn Pro    1870
Ser Lys Val Ser Tyr Glu Pro Ile Thr Thr Thr Leu Lys Arg Lys Gln    1885
Glu Xaa Val Ser Ala Thr Xaa Ile Gln Arg Ala Tyr Arg Arg Tyr Arg    1900
Leu Arg Gln Xaa Val Lys Asn Ile Ser Ser Ile Tyr Ile Lys Asp Gly    1910 1915 1920
Asp Arg Asp Asp Asp Leu Xaa Asn Lys Xaa Asp Xaa Xaa Phe Asp Asn    1935
Val Asn Glu Asn Ser Ser Pro Glu Lys Thr Asp Xaa Thr Xaa Ser Thr    1950
Xaa Ser Pro Pro Ser Tyr Asp Ser Val Thr Lys Pro Asp Xaa Glu Lys    1965
Tyr Glu Xaa Asp Xaa Thr Glu Lys Glu Asp Lys Xaa Lys Asp Ser Lys    1980
Glu Ser Xaa Lys Xaa    1985

(2) INFORMATION FOR SEQ ID NO:12:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 1989 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: Not Relevant
          (D) TOPOLOGY: Not Relevant

    (ii) MOLECULE TYPE: protein

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:

Met Ala Met Leu Pro Pro Pro Gly Pro Gln Ser Phe Val His Phe Thr    15
Lys Gln Ser Leu Ala Leu Ile Glu Gln Arg Ile Ser Glu Glu Lys Ala    30
Lys Glu His Lys Asp Glu Lys Lys Asp Asp Glu Glu Glu Gly Pro Lys    45
Pro Ser Ser Asp Leu Glu Ala Gly Lys Gln Leu Pro Phe Ile Tyr Gly    60
Asp Ile Pro Pro Gly Met Val Ser Glu Pro Leu Glu Asp Leu Asp Pro    80
Tyr Tyr Ala Asp Lys Lys Thr Phe Ile Val Leu Asn Lys Gly Lys Ala    95
Ile Phe Arg Phe Asn Ala Thr Pro Ala Leu Tyr Met Leu Ser Pro Phe    110
Ser Pro Leu Arg Arg Ile Ser Ile Lys Ile Leu Val His Ser Leu Phe    125
Ser Met Leu Ile Met Cys Thr Ile Leu Thr Asn Cys Ile Phe Met Thr    140
Leu Ser Asn Pro Pro Glu Trp Thr Lys Asn Val Gly Tyr Thr Phe Thr    145 150 155 160
Gly Ile Tyr Thr Phe Glu Ser Leu Ile Lys Ile Leu Ala Arg Gly Phe    175
Cys Val Gly Glu Phe Thr Phe Leu Arg Asp Pro Trp Asn Trp Leu Asp    190
Phe Val Val Ile Val Phe Ala Tyr Leu Thr Glu Phe Val Asn Leu Gly    205
Asn Val Ser Ala Leu Arg Thr Phe Arg Val Leu Arg Ala Leu Lys Thr    220
Ile Ser Val Ile Pro Gly Leu Lys Thr Ile Val Gly Ala Leu Ile Gln    225 230 235 240
Ser Val Lys Lys Leu Ser Asp Val Met Ile Leu Thr Val Phe Cys Leu    255
Ser Val Phe Ala Leu Ile Gly Leu Gln Leu Phe Met Gly Asn Leu Lys    270
His Lys Cys Phe Arg Lys Glu Leu Glu Glu Asn Glu Thr Leu Glu Ser    285
Ile Met Asn Thr Ala Glu Ser Glu Glu Glu Leu Lys Lys Tyr Phe Tyr    300
Tyr Leu Glu Gly Ser Lys Asp Ala Leu Leu Cys Gly Phe Ser Thr Asp    305 310 315 320
Ser Gly Gln Cys Pro Glu Gly Tyr Ile Cys Val Lys Ala Gly Arg Asn    335
Pro Asp Tyr Gly Tyr Thr Ser Phe Asp Thr Phe Ser Trp Ala Phe Leu    350
Ala Leu Phe Arg Leu Met Thr Gln Asp Tyr Trp Glu Asn Leu Tyr Gln    365
Gln Thr Leu Arg Ala Ala Gly Lys Thr Tyr Met Ile Phe Phe Val Val    380
Val Ile Phe Leu Gly Ser Phe Tyr Leu Ile Asn Leu Ile Leu Ala Val    385 390 395 400
Val Ala Met Ala Tyr Glu Glu Gln Asn Gln Ala Asn Ile Glu Glu Ala    415
Lys Gln Lys Glu Leu Glu Phe Gln Gln Met Leu Asp Arg Leu Lys Lys    430
Glu Gln Glu Glu Ala Glu Ala Ile Ala Ala Ala Ala Ala Glu Phe Thr    445
Ser Ile Arg Arg Ser Arg Ile Met Gly Leu Ser Glu Ser Ser Ser Glu    460
Thr Ser Arg Leu Ser Ser Lys Ser Ala Lys Glu Arg Arg Asn Arg Arg    465 470 475 480
Lys Lys Lys Lys Gln Lys Xaa Met Ser Ser Gly Glu Glu Lys Gly Asp    495
Asp Glu Lys Leu Ser Lys Ser Gly Ser Glu Glu Ser Ile Arg Lys Lys    510
Ser Phe His Leu Gly Val Glu Gly His His Arg Thr Arg Glu Lys Arg    525
Leu Ser Thr Pro Asn Gln Ser Pro Leu Ser Ile Arg Gly Ser Leu Phe    540
Ser Ala Arg Arg Ser Ser Arg Thr Ser Leu Phe Ser Phe Lys Gly Arg    545 550 555 560
Gly Arg Asp Leu Gly Ser Glu Thr Glu Phe Ala Asp Asp Glu His Ser    575
Ile Phe Gly Asp Asn Glu Ser Arg Arg Gly Ser Leu Phe Val Pro His    590
Arg Pro Arg Glu Arg Arg Ser Ser Asn Ile Ser Gln Ala Ser Arg Ser    605
Pro Pro Val Leu Pro Val Asn Gly Lys Met His Ser Ala Val Asp Cys    620
Asn Gly Val Val Ser Leu Val Asp Gly Pro Ser Ala Leu Met Leu Pro    625 630 635 640
Asn Gly Gln Leu Leu Pro Glu Val Ile Ile Asp Lys Ala Thr Ser Asp    655
Asp Ser Gly Thr Thr Asn Gln Met Arg Lys Lys Arg Leu Ser Ser Ser    670
Tyr Phe Leu Ser Glu Asp Met Leu Asn Asp Pro His Leu Arg Gln Arg    685
Ala Met Ser Arg Ala Ser Ile Leu Thr Asn Thr Val Glu Glu Leu Glu    700
Glu Ser Arg Gln Lys Cys His Gln Leu Leu Tyr Arg Phe Ala His Thr    705 710 715 720
Phe Leu Ile Trp Asn Cys Ser Pro Tyr Trp Ile Lys Phe Lys Lys Leu    735
Ile Tyr Phe Ile Val Met Asp Pro Phe Val Asp Leu Ala Ile Thr Ile    750
Cys Ile Val Leu Asn Thr Leu Phe Met Ala Met Glu His His Pro Met    765
Thr Glu Glu Phe Lys Asn Val Leu Ala Val Gly Asn Leu Ile Phe Thr    780
Gly Ile Phe Ala Ala Glu Met Val Leu Lys Leu Ile Ala Met Asp Pro    785 790 795 800
Tyr Glu Tyr Phe Gln Val Gly Trp Asn Ile Phe Asp Ser Leu Ile Val    815
Thr Leu Ser Leu Ile Glu Leu Phe Leu Ala As - #p Val Glu Gly Leu Ser          #           830                                                                - Val Leu Arg Ser Phe Arg Leu Leu Arg Val Ph - #e Lys Leu Ala Lys Ser          #       845                                                                    - Trp Pro Thr Leu Asn Met Leu Ile Lys Ile Il - #e Gly Asn Ser Val Gly          #   860                                                                        - Ala Leu Gly Asn Leu Thr Leu Val Leu Ala Il - #e Ile Val Phe Ile Phe          865                 8 - #70                 8 - #75                 8 -        #80                                                                            - Ala Val Val Gly Met Gln Leu Phe Gly Lys Se - #r Tyr Lys Glu Cys Val          #               895                                                            - Cys Lys Ile Asn Val Asp Cys Lys Leu Pro Ar - #g Trp His Met Asn Asp          #           910                                                                - Phe Phe His Ser Phe Leu Ile Val Phe Arg Va - #l Leu Cys Gly Glu Trp          #       925                                                                    - Ile Glu Thr Met Trp Asp Cys Met Glu Val Al - #a Gly Gln Thr Met Cys          #   940                                                                        - Leu Ile Val Tyr Met Met Val Met Val Ile Gl - #y Asn Leu Val Val Leu          945                 9 - #50                 9 - #55                 9 -        #60                                                                            - Asn Leu Phe Leu Ala Leu Leu Leu Ser Ser Ph - #e Ser Ser Asp Asn Leu          #               975                                                            - Thr Ala Ile Glu Glu Asp Thr Asp Ala Asn As - #n Leu Gln Ile Ala Val          #           990                                                                - Ala Arg Ile Lys Arg Gly Ile Asn Tyr Val Ly - #s Gln Thr Leu Arg Glu          #      10050               
                                                    - Phe Ile Leu Lys Ser Phe Ser Lys Lys Pro Ly - #s Gly Ser Lys Asp Thr          #  10205                                                                       - Lys Arg Thr Ala Asp Pro Asn Asn Lys Lys Gl - #u Asn Tyr Ile Ser Asn          #               10401030 - #                1035                               - Arg Thr Leu Ala Glu Met Ser Lys Asp His As - #n Phe Leu Lys Glu Lys          #              10550                                                           - Asp Arg Ile Ser Gly Tyr Gly Ser Ser Leu As - #p Lys Ser Phe Met Asp          #          10705                                                               - Glu Asn Asp Tyr Gln Ser Phe Ile His Asn Pr - #o Ser Leu Thr Val Thr          #      10850                                                                   - Val Pro Ile Ala Pro Gly Glu Ser Asp Leu Gl - #u Ile Met Asn Thr Glu          #  11005                                                                       - Glu Leu Ser Ser Asp Ser Asp Ser Asp Tyr Se - #r Lys Glu Lys Arg Asn          #               11201110 - #                1115                               - Arg Ser Ser Ser Ser Glu Cys Ser Thr Val As - #p Asn Pro Leu Pro Gly          #              11350                                                           - Glu Xaa Glu Glu Ala Glu Ala Glu Pro Val As - #n Ala Asp Glu Pro Glu          #          11505                                                               - Ala Cys Phe Thr Asp Gly Cys Val Arg Arg Ph - #e Pro Cys Cys Gln Val          #      11650                                                                   - Asn Val Asp Ser Gly Lys Gly Lys Val Trp Tr - #p Thr Ile Arg Lys Thr          #  11805                                                                       - Cys Tyr Arg Ile Val Glu His Ser Trp Phe Gl - #u Ser Phe Ile Val Leu          #               12001190 - #                1195                               - Met Ile Leu Leu Ser Ser Gly Ala Leu Ala Ph - #e 
Glu Asp Ile Tyr Ile          #              12150                                                           - Glu Lys Lys Lys Thr Ile Lys Ile Ile Leu Gl - #u Tyr Ala Asp Lys Ile          #          12305                                                               - Phe Thr Tyr Ile Phe Ile Leu Glu Met Leu Le - #u Lys Trp Val Ala Tyr          #      12450                                                                   - Gly Tyr Lys Thr Tyr Phe Thr Asn Ala Trp Cy - #s Trp Leu Asp Phe Leu          #  12605                                                                       - Ile Val Asp Val Ser Leu Val Thr Leu Val Al - #a Asn Thr Leu Gly Tyr          #               12801270 - #                1275                               - Ser Asp Leu Gly Pro Ile Lys Ser Leu Arg Th - #r Leu Arg Ala Leu Arg          #              12950                                                           - Pro Leu Arg Ala Leu Ser Arg Phe Glu Gly Me - #t Arg Val Val Val Asn          #          13105                                                               - Ala Leu Ile Gly Ala Ile Pro Ser Ile Met As - #n Val Leu Leu Val Cys          #      13250                                                                   - Leu Ile Phe Trp Leu Ile Phe Ser Ile Met Gl - #y Val Asn Leu Phe Ala          #  13405                                                                       - Gly Lys Phe Tyr Glu Cys Val Asn Thr Thr As - #p Gly Ser Arg Phe Pro          #               13601350 - #                1355                               - Thr Ser Gln Val Ala Asn Arg Ser Glu Cys Ph - #e Ala Leu Met Asn Val          #              13750                                                           - Ser Gly Asn Val Arg Trp Lys Asn Leu Lys Va - #l Asn Phe Asp Asn Val          #          13905                                                               - Gly Leu Gly Tyr Leu Ser Leu Leu Gln Val Al - #a Thr Phe Lys Gly Trp          #      14050                                                               
    - Met Asp Ile Met Tyr Ala Ala Val Asp Ser Va - #l Asn Val Asn Glu Gln          #  14205                                                                       - Pro Lys Tyr Glu Tyr Ser Leu Tyr Met Tyr Il - #e Tyr Phe Val Ile Phe          #               14401430 - #                1435                               - Ile Ile Phe Gly Ser Phe Phe Thr Leu Asn Le - #u Phe Ile Gly Val Ile          #              14550                                                           - Ile Asp Asn Phe Asn Gln Gln Lys Lys Lys Le - #u Gly Gly Gln Asp Ile          #          14705                                                               - Phe Met Thr Glu Glu Gln Lys Lys Tyr Tyr As - #n Ala Met Lys Lys Leu          #      14850                                                                   - Gly Ser Lys Lys Pro Gln Lys Pro Ile Pro Ar - #g Pro Gly Asn Lys Phe          #  15005                                                                       - Gln Gly Cys Ile Phe Asp Leu Val Thr Asn Gl - #n Ala Phe Asp Ile Thr          #               15201510 - #                1515                               - Ile Met Val Leu Ile Cys Leu Asn Met Val Th - #r Met Met Val Glu Lys          #              15350                                                           - Glu Gly Gln Thr Glu Tyr Met Asp Tyr Val Le - #u His Trp Ile Asn Met          #          15505                                                               - Val Phe Ile Ile Leu Phe Thr Gly Glu Cys Va - #l Leu Lys Leu Ile Ser          #      15650                                                                   - Leu Arg His Tyr Tyr Phe Thr Val Gly Trp As - #n Ile Leu Tyr Phe Val          #  15805                                                                       - Val Val Ile Leu Ser Ile Val Gly Met Phe Le - #u Ala Glu Met Ile Glu          #               16001590 - #                1595                               - Lys Tyr Phe Val Ser Pro Thr Leu Phe Arg Va - #l Ile Arg Leu Ala Arg          #              16150 
                                                          - Ile Gly Arg Ile Leu Arg Leu Ile Lys Gly Al - #a Lys Gly Ile Arg Thr          #          16305                                                               - Leu Leu Phe Ala Leu Met Met Ser Leu Pro Al - #a Leu Phe Asn Ile Gly          #      16450                                                                   - Leu Leu Leu Phe Leu Val Met Phe Ile Tyr Al - #a Ile Phe Gly Met Ser          #  16605                                                                       - Asn Phe Ala Tyr Val Lys Lys Glu Ala Gly Il - #e Asn Asp Met Phe Asn          #               16801670 - #                1675                               - Phe Glu Thr Phe Gly Asn Ser Met Ile Cys Le - #u Phe Gln Ile Thr Thr          #              16950                                                           - Ser Ala Gly Trp Asp Gly Leu Leu Ala Pro Il - #e Leu Asn Ser Ala Pro          #          17105                                                               - Pro Asp Cys Asp Pro Lys Lys Val His Pro Gl - #y Ser Ser Val Glu Gly          #      17250                                                                   - Asp Cys Gly Asn Pro Ser Val Gly Ile Phe Ty - #r Phe Val Ser Tyr Ile          #  17405                                                                       - Ile Ile Ser Phe Leu Val Val Val Asn Met Ty - #r Ile Ala Val Ile Leu          #               17601750 - #                1755                               - Glu Asn Phe Ser Val Ala Thr Glu Glu Ser Th - #r Glu Pro Leu Ser Glu          #              17750                                                           - Asp Asp Phe Glu Met Phe Tyr Glu Val Trp Gl - #u Lys Phe Asp Pro Asp          #          17905                                                               - Ala Thr Gln Phe Ile Glu Phe Cys Lys Leu Se - #r Asp Phe Ala Ala Ala          #      18050                                                                   - Leu Asp Pro Pro Leu Leu Ile Ala Lys Pro As 
- #n Lys Val Gln Leu Ile          #  18205                                                                       - Ala Met Asp Leu Pro Met Val Ser Gly Asp Ar - #g Ile His Cys Leu Asp          #               18401830 - #                1835                               - Ile Leu Phe Ala Phe Thr Lys Arg Val Leu Gl - #y Glu Gly Gly Glu Met          #              18550                                                           - Asp Ser Leu Arg Ser Gln Met Glu Glu Arg Ph - #e Met Ser Ala Asn Pro          #          18705                                                               - Ser Lys Val Ser Tyr Glu Pro Ile Thr Thr Th - #r Leu Lys Arg Lys Gln          #      18850                                                                   - Glu Glu Val Ser Ala Thr Ile Ile Gln Arg Al - #a Tyr Arg Arg Tyr Arg          #  19005                                                                       - Leu Arg Gln His Val Lys Asn Ile Ser Ser Il - #e Tyr Ile Lys Asp Gly          #               19201910 - #                1915                               - Asp Arg Asp Asp Asp Leu Pro Asn Lys Glu As - #p Thr Val Phe Asp Asn          #              19350                                                           - Val Asn Glu Asn Ser Ser Pro Glu Lys Thr As - #p Val Thr Ala Ser Thr          #          19505                                                               - Ile Ser Pro Pro Ser Tyr Asp Ser Val Thr Ly - #s Pro Asp Gln Glu Lys          #      19650                                                                   - Tyr Glu Thr Asp Lys Thr Glu Lys Glu Asp Ly - #s Glu Lys Asp Xaa Xaa          #  19805                                                                       - Glu Ser Arg Lys Xaa                                                          1985                                                                           - (2) INFORMATION FOR SEQ ID NO:13:                                            -      (i) SEQUENCE CHARACTERISTICS:                                  
          (A) LENGTH: 6371 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: both
          (D) TOPOLOGY: both
-     (ii) MOLECULE TYPE: DNA (genomic)
-     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
- CTCTTATGTG AGGAGCTGAA GAGGAATTAA AATATACAGG ATGAAAAGAT GGCAATGTTG          60
- CCTCCCCCAG GACCTCAGAG CTTTGTCCAT TTCACAAAAC AGTCTCTTGC CCTCATTGAA         120
- CAACGCATTG CTGAAAGAAA ATCAAAGGAA CCCAAAGAAG AAAAGAAAGA TGATGATGAA         180
- GAAGCCCCAA AGCCAAGCAG TGACTTGGAA GCTGGCAAAC AACTGCCCTT CATCTATGGG         240
- GACATTCCTC CCGGCATGGT GTCAGAGCCC CTGGAGGACT TGGACCCCTA CTATGCAGAC         300
- AAAAAGACTT TCATAGTATT GAACAAAGGG AAAACAATCT TCCGTTTCAA TGCCACACCT         360
- GCTTTATATA TGCTTTCTCC TTTCAGTCCT CTAAGAAGAA TATCTATTAA GATTTTAGTA         420
- CACTCCTTAT TCAGCATGCT CATCATGTGC ACTATTCTGA CAAACTGCAT ATTTATGACC         480
- ATGAATAACC CGCCGGACTG GACCAAAAAT GTCGAGTACA CTTTTACTGG AATATATACT         540
- TTTGAATCAC TTGTAAAAAT CCTTGCAAGA GGCTTCTGTG TAGGAGAATT CACTTTTCTT         600
                                                               - CGTGACCCGT GGAACTGGCT GGATTTTGTC GTCATTGTTT TTGCGTATTT AA - #CAGAATTT         660                                                                           - GTAAACCTAG GCAATGTTTC AGCTCTTCGA ACTTTCAGAG TATTGAGAGC TT - #TGAAAACT         720                                                                           - ATTTCTGTAA TCCCAGGCCT GAAGACAATT GTAGGGGCTT TGATCCAGTC AG - #TGAAGAAG         780                                                                           - CTTTCTGATG TCATGATCCT GACTGTGTTC TGTCTGAGTG TGTTTGCACT AA - #TTGGACTA         840                                                                           - CAGCTGTTCA TGGGAAACCT GAAGCATAAA TGTTTTCGAA ATTCACTTGA AA - #ATAATGAA         900                                                                           - ACATTAGAAA GCATAATGAA TACCCTAGAG AGTGAAGAAG ACTTTAGAAA AT - #ATTTTTAT         960                                                                           - TACTTGGAAG GATCCAAAGA TGCTCTCCTT TGTGGTTTCA GCACAGATTC AG - #GTCAGTGT        1020                                                                           - CCAGAGGGGT ACACCTGTGT GAAAATTGGC AGAAACCCTG ATTATGGCTA CA - #CGAGCTTT        1080                                                                           - GACACTTTCA GCTGGGCCTT CTTAGCCTTG TTTAGGCTAA TGACCCAAGA TT - #ACTGGGAA        1140                                                                           - AACCTTTACC AACAGACGCT GCGTGCTGCT GGCAAAACCT ACATGATCTT CT - #TTGTCGTA        1200                                                                           - GTGATTTTCC TGGGCTCCTT TTATCTAATA AACTTGATCC TGGCTGTGGT TG - #CCATGGCA        1260                                                                           - TATGAAGAAC AGAACCAGGC AAACATTGAA GAAGCTAAAC AGAAAGAATT AG - #AATTTCAA        1320                                                                           - CAGATGTTAG ACCGTCTTAA AAAAGAGCAA 
GAAGAAGCTG AGGCAATTGC AG - #CGGCAGCG        1380                                                                           - GCTGAATATA CAAGTATTAG GAGAAGCAGA ATTATGGGCC TCTCAGAGAG TT - #CTTCTGAA        1440                                                                           - ACATCCAAAC TGAGCTCTAA AAGTGCTAAA GAAAGAAGAA ACAGAAGAAA GA - #AAAAGAAT        1500                                                                           - CAAAAGAAGC TCTCCAGTGG AGAGGAAAAG GGAGATGCTG AGAAATTGTC GA - #AATCAGAA        1560                                                                           - TCAGAGGACA GCATCAGAAG AAAAAGTTTC CACCTTGGTG TCGAAGGGCA TA - #GGCGAGCA        1620                                                                           - CATGAAAAGA GGTTGTCTAC CCCCAATCAG TCACCACTCA GCATTCGTGG CT - #CCTTGTTT        1680                                                                           - TCTGCAAGGC GAAGCAGCAG AACAAGTCTT TTTAGTTTCA AAGGCAGAGG AA - #GAGATATA        1740                                                                           - GGATCTGAGA CTGAATTTGC CGATGATGAG CACAGCATTT TTGGAGACAA TG - #AGAGCAGA        1800                                                                           - AGGGGCTCAC TGTTTGTGCC CCACAGACCC CAGGAGCGAC GCAGCAGTAA CA - #TCAGCCAA        1860                                                                           - GCCAGTAGGT CCCCACCAAT GCTGCCGGTG AACGGGAAAA TGCACAGTGC TG - #TGGACTGC        1920                                                                           - AACGGTGTGG TCTCCCTGGT TGATGGACGC TCAGCCCTCA TGCTCCCCAA TG - #GACAGCTT        1980                                                                           - CTGCCAGAGG GCACGACCAA TCAAATACAC AAGAAAAGGC GTTGTAGTTC CT - #ATCTCCTT        2040                                                                           - TCAGAGGATA TGCTGAATGA TCCCAACCTC AGACAGAGAG CAATGAGTAG AG - #CAAGCATA        2100                                                        
                   - TTAACAAACA CTGTGGAAGA ACTTGAAGAG TCCAGACAAA AATGTCCACC TT - #GGTGGTAC        2160                                                                           - AGATTTGCAC ACAAATTCTT GATCTGGAAT TGCTCTCCAT ATTGGATAAA AT - #TCAAAAAG        2220                                                                           - TGTATCTATT TTATTGTAAT GGATCCTTTT GTAGATCTTG CAATTACCAT TT - #GCATAGTT        2280                                                                           - TTAAACACAT TATTTATGGC TATGGAACAC CACCCAATGA CTGAGGAATT CA - #AAAATGTA        2340                                                                           - CTTGCTATAG GAAATTTGGT CTTTACTGGA ATCTTTGCAG CTGAAATGGT AT - #TAAAACTG        2400                                                                           - ATTGCCATGG ATCCATATGA GTATTTCCAA GTAGGCTGGA ATATTTTTGA CA - #GCCTTATT        2460                                                                           - GTGACTTTAA GTTTAGTGGA GCTCTTTCTA GCAGATGTGG AAGGATTGTC AG - #TTCTGCGA        2520                                                                           - TCATTCAGAC TGCTCCGAGT CTTCAAGTTG GCAAAATCCT GGCCAACATT GA - #ACATGCTG        2580                                                                           - ATTAAGATCA TTGGTAACTC AGTAGGGGCT CTAGGTAACC TCACCTTAGT GT - #TGGCCATC        2640                                                                           - ATCGTCTTCA TTTTTGCTGT GGTCGGCATG CAGCTCTTTG GTAAGAGCTA CA - #AAGAATGT        2700                                                                           - GTCTGCAAGA TCAATGATGA CTGTACGCTC CCACGGTGGC ACATGAACGA CT - #TCTTCCAC        2760                                                                           - TCCTTCCTGA TTGTGTTCCG CGTGCTGTGT GGAGAGTGGA TAGAGACCAT GT - #GGGACTGT        2820                                                                           - ATGGAGGTCG CTGGTCAAGC TATGTGCCTT ATTGTTTACA TGATGGTCAT GG - #TCATTGGA        2880  
                                                                         - AACCTGGTGG TCCTAAACCT ATTTCTGGCC TTATTATTGA GCTCATTTAG TT - #CAGACAAT        2940                                                                           - CTTACAGCAA TTGAAGAAGA CCCTGATGCA AACAACCTCC AGATTGCAGT GA - #CTAGAATT        3000                                                                           - AAAAAGGGAA TAAATTATGT GAAACAAACC TTACGTGAAT TTATTCTAAA AG - #CATTTTCC        3060                                                                           - AAAAAGCCAA AGATTTCCAG GGAGATAAGA CAAGCAGAAG ATCTGAATAC TA - #AGAAGGAA        3120                                                                           - AACTATATTT CTAACCATAC ACTTGCTGAA ATGAGCAAAG GTCACAATTT CC - #TCAAGGAA        3180                                                                           - AAAGATAAAA TCAGTGGTTT TGGAAGCAGC GTGGACAAAC ACTTGATGGA AG - #ACAGTGAT        3240                                                                           - GGTCAATCAT TTATTCACAA TCCCAGCCTC ACAGTGACAG TGCCAATTGC AC - #CTGGGGAA        3300                                                                           - TCCGATTTGG AAAATATGAA TGCTGAGGAA CTTAGCAGTG ATTCGGATAG TG - #AATACAGC        3360                                                                           - AAAGTGAGAT TAAACCGGTC AAGCTCCTCA GAGTGCAGCA CAGTTGATAA CC - #CTTTGCCT        3420                                                                           - GGAGAAGGAG AAGAAGCAGA GGCTGAACCT ATGAATTCCG ATGAGCCAGA GG - #CCTGTTTC        3480                                                                           - ACAGATGGTT GTGTACGGAG GTTCTCATGC TGCCAAGTTA ACATAGAGTC AG - #GGAAAGGA        3540                                                                           - AAAATCTGGT GGAACATCAG GAAAACCTGC TACAAGATTG TTGAACACAG TT - #GGTTTGAA        3600                                                                           - AGCTTCATTG TCCTCATGAT 
CCTGCTCAGC AGTGGTGCCC TGGCTTTTGA AG - #ATATTTAT        3660                                                                           - ATTGAAAGGA AAAAGACCAT TAAGATTATC CTGGAGTATG CAGACAAGAT CT - #TCACTTAC        3720                                                                           - ATCTTCATTC TGGAAATGCT TCTAAAATGG ATAGCATATG GTTATAAAAC AT - #ATTTCACC        3780                                                                           - AATGCCTGGT GTTGGCTGGA TTTCCTAATT GTTGATGTTT CTTTGGTTAC TT - #TAGTGGCA        3840                                                                           - AACACTCTTG GCTACTCAGA TCTTGGCCCC ATTAAATCCC TTCGGACACT GA - #GAGCTTTA        3900                                                                           - AGACCTCTAA GAGCCTTATC TAGATTTGAA GGAATGAGGG TCGTTGTGAA TG - #CACTCATA        3960                                                                           - GGAGCAATTC CTTCCATCAT GAATGTGCTA CTTGTGTGTC TTATATTCTG GC - #TGATATTC        4020                                                                           - AGCATCATGG GAGTAAATTT GTTTGCTGGC AAGTTCTATG AGTGTATTAA CA - #CCACAGAT        4080                                                                           - GGGTCACGGT TTCCTGCAAG TCAAGTTCCA AATCGTTCCG AATGTTTTGC CC - #TTATGAAT        4140                                                                           - GTTAGTCAAA ATGTGCGATG GAAAAACCTG AAAGTGAACT TTGATAATGT CG - #GACTTGGT        4200                                                                           - TACCTATCTC TGCTTCAAGT TGCAACTTTT AAGGGATGGA CGATTATTAT GT - #ATGCAGCA        4260                                                                           - GTGGATTCTG TTAATGTAGA CAAGCAGCCC AAATATGAAT ATAGCCTCTA CA - #TGTATATT        4320                                                                           - TATTTTGTCG TCTTTATCAT CTTTGGGTCA TTCTTCACTT TGAACTTGTT CA - #TTGGTGTC        4380                                             
                              - ATCATAGATA ATTTCAACCA ACAGAAAAAG AAGCTTGGAG GTCAAGACAT CT - #TTATGACA        4440                                                                           - GAAGAACAGA AGAAATACTA TAATGCAATG AAAAAGCTGG GGTCCAAGAA GC - #CACAAAAG        4500                                                                           - CCAATTCCTC GACCAGGGAA CAAAATCCAA GGATGTATAT TTGACCTAGT GA - #CAAATCAA        4560                                                                           - GCCTTTGATA TTAGTATCAT GGTTCTTATC TGTCTCAACA TGGTAACCAT GA - #TGGTAGAA        4620                                                                           - AAGGAGGGTC AAAGTCAACA TATGACTGAA GTTTTATATT GGATAAATGT GG - #TTTTTATA        4680                                                                           - ATCCTTTTCA CTGGAGAATG TGTGCTAAAA CTGATCTCCC TCAGACACTA CT - #ACTTCACT        4740                                                                           - GTAGGATGGA ATATTTTTGA TTTTGTGGTT GTGATTATCT CCATTGTAGG TA - #TGTTTCTA        4800                                                                           - GCTGATTTGA TTGAAACGTA TTTTGTGTCC CCTACCCTGT TCCGAGTGAT CC - #GTCTTGCC        4860                                                                           - AGGATTGGCC GAATCCTACG TCTAGTCAAA GGAGCAAAGG GGATCCGCAC GC - #TGCTCTTT        4920                                                                           - GCTTTGATGA TGTCCCTTCC TGCGTTGTTT AACATCGGCC TCCTGCTCTT CC - #TGGTCATG        4980                                                                           - TTCATCTACG CCATCTTTGG AATGTCCAAC TTTGCCTATG TTAAAAAGGA AG - #ATGGAATT        5040                                                                           - AATGACATGT TCAATTTTGA GACCTTTGGC AACAGTATGA TTTGCCTGTT CC - #AAATTACA        5100                                                                           - ACCTCTGCTG GCTGGGATGG ATTGCTAGCA CCTATTCTTA ACAGTAAGCC AC - #CCGACTGT   
     5160                                                                           - GACCCAAAAA AAGTTCATCC TGGAAGTTCA GTTGAAGGAG ACTGTGGTAA CC - #CATCTGTT        5220                                                                           - GGAATATTCT ACTTTGTTAG TTATATCATC ATATCCTTCC TGGTTGTGGT GA - #ACATGTAC        5280                                                                           - ATTGCAGTCA TACTGGAGAA TTTTAGTGTT GCCACTGAAG AAAGTACTGA AC - #CTCTGAGT        5340                                                                           - GAGGATGACT TTGAGATGTT CTATGAGGTT TGGGAGAAGT TTGATCCCGA TG - #CGACCCAG        5400                                                                           - TTTATAGAGT TCTCTAAACT CTCTGATTTT GCAGCTGCCC TGGATCCTCC TC - #TTCTCATA        5460                                                                           - GCAAAACCCA ACAAAGTCCA GCTCATTGCC ATGGATCTGC CCATGGTTAG TG - #GTGACCGG        5520                                                                           - ATCCATTGTC TTGACATCTT ATTTGCTTTT ACAAAGCGTG TTTTGGGTGA GA - #GTGGGGAG        5580                                                                           - ATGGATTCTC TTCGTTCACA GATGGAAGAA AGGTTCATGT CTGCAAATCC TT - #CCAAAGTG        5640                                                                           - TCCTATGAAC CCATCACAAC CACACTAAAA CGGAAACAAG AGGATGTGTC TG - #CTACTGTC        5700                                                                           - ATTCAGCGTG CTTATAGACG TTACCGCTTA AGGCAAAATG TCAAAAATAT AT - #CAAGTATA        5760                                                                           - TACATAAAAG ATGGAGACAG AGATGATGAT TTACTCAATA AAAAAGATAT GG - #CTTTTGAT        5820                                                                           - AATGTTAATG AGAACTCAAG TCCAGAAAAA ACAGATGCCA CTTCATCCAC CA - #CCTCTCCA        5880                                                                           - CCTTCATATG 
ATAGTGTAAC AAAGCCAGAC AAAGAGAAAT ATGAACAAGA CAGAACAGAA        5940
- AAGGAAGACA AAGGGAAAGA CAGCAAGGAA AGCAAAAAAT AGAGCTTCAT TTTTGATATA        6000
- TTGTTTACAG CCTGTGAAAG TGATTTATTT GTGTTAATAA AACTCTTTTG AGGAAGTCTA        6060
- TGCCAAAATC CTTTTTATCA AAATATTCTC GAAGGCAGTG CAGTCACTAA CTCTGATTTC        6120
- CTAAGAAAGG TGGGCAGCAT TAGCAGATGG TTATTTTTGC ACTGATGATT CTTTAAGAAT        6180
- CGTAAGAGAA CTCTGTAGGA ATTATTGATT ATAGCATACA AAAGTGATTG ATTCAGTTTT        6240
- TTGGTTTTTA ATAAATCAGA AGACCATGTA GAAAACTTTT ACATCTGCCT TGTCATCTTT        6300
- TCACAGGATT GTAATTAGTC TTGTTTCCCA TGTAAATAAA CAACACACGC ATACAGAAAA        6360
                                                                           6371
- (2) INFORMATION FOR SEQ ID NO:14:
-      (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 6404 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: both
          (D) TOPOLOGY: both
-     (ii) MOLECULE TYPE: DNA (genomic)
-     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
- CTCTTATGTG AGGAGCTGAA GAGGAATTAA 
AATATACAGG ATGAAAAGAT GG - #CAATGTTG          60                                                                           - CCTCCCCCAG GACCTCAGAG CTTTGTCCAT TTCACAAAAC AGTCTCTTGC CC - #TCATTGAA         120                                                                           - CAACGCATTG CTGAAAGAAA ATCAAAGGAA CCCAAAGAAG AAAAGAAAGA TG - #ATGATGAA         180                                                                           - GAAGCCCCAA AGCCAAGCAG TGACTTGGAA GCTGGCAAAC AACTGCCCTT CA - #TCTATGGG         240                                                                           - GACATTCCTC CCGGCATGGT GTCAGAGCCC CTGGAGGACT TGGACCCCTA CT - #ATGCAGAC         300                                                                           - AAAAAGACTT TCATAGTATT GAACAAAGGG AAAACAATCT TCCGTTTCAA TG - #CCACACCT         360                                                                           - GCTTTATATA TGCTTTCTCC TTTCAGTCCT CTAAGAAGAA TATCTATTAA GA - #TTTTAGTA         420                                                                           - CACTCCTTAT TCAGCATGCT CATCATGTGC ACTATTCTGA CAAACTGCAT AT - #TTATGACC         480                                                                           - ATGAATAACC CGCCGGACTG GACCAAAAAT GTCGAGTACA CTTTTACTGG AA - #TATATACT         540                                                                           - TTTGAATCAC TTGTAAAAAT CCTTGCAAGA GGCTTCTGTG TAGGAGAATT CA - #CTTTTCTT         600                                                                           - CGTGACCCGT GGAACTGGCT GGATTTTGTC GTCATTGTTT TTGCGTATTT AA - #CAGAATTT         660                                                                           - GTAAACCTAG GCAATGTTTC AGCTCTTCGA ACTTTCAGAG TATTGAGAGC TT - #TGAAAACT         720                                                                           - ATTTCTGTAA TCCCAGGCCT GAAGACAATT GTAGGGGCTT TGATCCAGTC AG - #TGAAGAAG         780                                                        
                   - CTTTCTGATG TCATGATCCT GACTGTGTTC TGTCTGAGTG TGTTTGCACT AA - #TTGGACTA         840                                                                           - CAGCTGTTCA TGGGAAACCT GAAGCATAAA TGTTTTCGAA ATTCACTTGA AA - #ATAATGAA         900                                                                           - ACATTAGAAA GCATAATGAA TACCCTAGAG AGTGAAGAAG ACTTTAGAAA AT - #ATTTTTAT         960                                                                           - TACTTGGAAG GATCCAAAGA TGCTCTCCTT TGTGGTTTCA GCACAGATTC AG - #GTCAGTGT        1020                                                                           - CCAGAGGGGT ACACCTGTGT GAAAATTGGC AGAAACCCTG ATTATGGCTA CA - #CGAGCTTT        1080                                                                           - GACACTTTCA GCTGGGCCTT CTTAGCCTTG TTTAGGCTAA TGACCCAAGA TT - #ACTGGGAA        1140                                                                           - AACCTTTACC AACAGACGCT GCGTGCTGCT GGCAAAACCT ACATGATCTT CT - #TTGTCGTA        1200                                                                           - GTGATTTTCC TGGGCTCCTT TTATCTAATA AACTTGATCC TGGCTGTGGT TG - #CCATGGCA        1260                                                                           - TATGAAGAAC AGAACCAGGC AAACATTGAA GAAGCTAAAC AGAAAGAATT AG - #AATTTCAA        1320                                                                           - CAGATGTTAG ACCGTCTTAA AAAAGAGCAA GAAGAAGCTG AGGCAATTGC AG - #CGGCAGCG        1380                                                                           - GCTGAATATA CAAGTATTAG GAGAAGCAGA ATTATGGGCC TCTCAGAGAG TT - #CTTCTGAA        1440                                                                           - ACATCCAAAC TGAGCTCTAA AAGTGCTAAA GAAAGAAGAA ACAGAAGAAA GA - #AAAAGAAT        1500                                                                           - CAAAAGAAGC TCTCCAGTGG AGAGGAAAAG GGAGATGCTG AGAAATTGTC GA - #AATCAGAA        1560  
                                                                         - TCAGAGGACA GCATCAGAAG AAAAAGTTTC CACCTTGGTG TCGAAGGGCA TA - #GGCGAGCA        1620                                                                           - CATGAAAAGA GGTTGTCTAC CCCCAATCAG TCACCACTCA GCATTCGTGG CT - #CCTTGTTT        1680                                                                           - TCTGCAAGGC GAAGCAGCAG AACAAGTCTT TTTAGTTTCA AAGGCAGAGG AA - #GAGATATA        1740                                                                           - GGATCTGAGA CTGAATTTGC CGATGATGAG CACAGCATTT TTGGAGACAA TG - #AGAGCAGA        1800                                                                           - AGGGGCTCAC TGTTTGTGCC CCACAGACCC CAGGAGCGAC GCAGCAGTAA CA - #TCAGCCAA        1860                                                                           - GCCAGTAGGT CCCCACCAAT GCTGCCGGTG AACGGGAAAA TGCACAGTGC TG - #TGGACTGC        1920                                                                           - AACGGTGTGG TCTCCCTGGT TGATGGACGC TCAGCCCTCA TGCTCCCCAA TG - #GACAGCTT        1980                                                                           - CTGCCAGAGG TGATAATAGA TAAGACAACT TCTGATGACA GCGGCACGAC CA - #ATCAAATA        2040                                                                           - CACAAGAAAA GGCGTTGTAG TTCCTATCTC CTTTCAGAGG ATATGCTGAA TG - #ATCCCAAC        2100                                                                           - CTCAGACAGA GAGCAATGAG TAGAGCAAGC ATATTAACAA ACACTGTGGA AG - #AACTTGAA        2160                                                                           - GAGTCCAGAC AAAAATGTCC ACCTTGGTGG TACAGATTTG CACACAAATT CT - #TGATCTGG        2220                                                                           - AATTGCTCTC CATATTGGAT AAAATTCAAA AAGTGTATCT ATTTTATTGT AA - #TGGATCCT        2280                                                                           - TTTGTAGATC TTGCAATTAC 
CATTTGCATA GTTTTAAACA CATTATTTAT GG - #CTATGGAA        2340                                                                           - CACCACCCAA TGACTGAGGA ATTCAAAAAT GTACTTGCTA TAGGAAATTT GG - #TCTTTACT        2400                                                                           - GGAATCTTTG CAGCTGAAAT GGTATTAAAA CTGATTGCCA TGGATCCATA TG - #AGTATTTC        2460                                                                           - CAAGTAGGCT GGAATATTTT TGACAGCCTT ATTGTGACTT TAAGTTTAGT GG - #AGCTCTTT        2520                                                                           - CTAGCAGATG TGGAAGGATT GTCAGTTCTG CGATCATTCA GACTGCTCCG AG - #TCTTCAAG        2580                                                                           - TTGGCAAAAT CCTGGCCAAC ATTGAACATG CTGATTAAGA TCATTGGTAA CT - #CAGTAGGG        2640                                                                           - GCTCTAGGTA ACCTCACCTT AGTGTTGGCC ATCATCGTCT TCATTTTTGC TG - #TGGTCGGC        2700                                                                           - ATGCAGCTCT TTGGTAAGAG CTACAAAGAA TGTGTCTGCA AGATCAATGA TG - #ACTGTACG        2760                                                                           - CTCCCACGGT GGCACATGAA CGACTTCTTC CACTCCTTCC TGATTGTGTT CC - #GCGTGCTG        2820                                                                           - TGTGGAGAGT GGATAGAGAC CATGTGGGAC TGTATGGAGG TCGCTGGTCA AG - #CTATGTGC        2880                                                                           - CTTATTGTTT ACATGATGGT CATGGTCATT GGAAACCTGG TGGTCCTAAA CC - #TATTTCTG        2940                                                                           - GCCTTATTAT TGAGCTCATT TAGTTCAGAC AATCTTACAG CAATTGAAGA AG - #ACCCTGAT        3000                                                                           - GCAAACAACC TCCAGATTGC AGTGACTAGA ATTAAAAAGG GAATAAATTA TG - #TGAAACAA        3060                                             
                              - ACCTTACGTG AATTTATTCT AAAAGCATTT TCCAAAAAGC CAAAGATTTC CA - #GGGAGATA        3120                                                                           - AGACAAGCAG AAGATCTGAA TACTAAGAAG GAAAACTATA TTTCTAACCA TA - #CACTTGCT        3180                                                                           - GAAATGAGCA AAGGTCACAA TTTCCTCAAG GAAAAAGATA AAATCAGTGG TT - #TTGGAAGC        3240                                                                           - AGCGTGGACA AACACTTGAT GGAAGACAGT GATGGTCAAT CATTTATTCA CA - #ATCCCAGC        3300                                                                           - CTCACAGTGA CAGTGCCAAT TGCACCTGGG GAATCCGATT TGGAAAATAT GA - #ATGCTGAG        3360                                                                           - GAACTTAGCA GTGATTCGGA TAGTGAATAC AGCAAAGTGA GATTAAACCG GT - #CAAGCTCC        3420                                                                           - TCAGAGTGCA GCACAGTTGA TAACCCTTTG CCTGGAGAAG GAGAAGAAGC AG - #AGGCTGAA        3480                                                                           - CCTATGAATT CCGATGAGCC AGAGGCCTGT TTCACAGATG GTTGTGTACG GA - #GGTTCTCA        3540                                                                           - TGCTGCCAAG TTAACATAGA GTCAGGGAAA GGAAAAATCT GGTGGAACAT CA - #GGAAAACC        3600                                                                           - TGCTACAAGA TTGTTGAACA CAGTTGGTTT GAAAGCTTCA TTGTCCTCAT GA - #TCCTGCTC        3660                                                                           - AGCAGTGGTG CCCTGGCTTT TGAAGATATT TATATTGAAA GGAAAAAGAC CA - #TTAAGATT        3720                                                                           - ATCCTGGAGT ATGCAGACAA GATCTTCACT TACATCTTCA TTCTGGAAAT GC - #TTCTAAAA        3780                                                                           - TGGATAGCAT ATGGTTATAA AACATATTTC ACCAATGCCT GGTGTTGGCT GG - #ATTTCCTA   
     3840                                                                           - ATTGTTGATG TTTCTTTGGT TACTTTAGTG GCAAACACTC TTGGCTACTC AG - #ATCTTGGC        3900                                                                           - CCCATTAAAT CCCTTCGGAC ACTGAGAGCT TTAAGACCTC TAAGAGCCTT AT - #CTAGATTT        3960                                                                           - GAAGGAATGA GGGTCGTTGT GAATGCACTC ATAGGAGCAA TTCCTTCCAT CA - #TGAATGTG        4020                                                                           - CTACTTGTGT GTCTTATATT CTGGCTGATA TTCAGCATCA TGGGAGTAAA TT - #TGTTTGCT        4080                                                                           - GGCAAGTTCT ATGAGTGTAT TAACACCACA GATGGGTCAC GGTTTCCTGC AA - #GTCAAGTT        4140                                                                           - CCAAATCGTT CCGAATGTTT TGCCCTTATG AATGTTAGTC AAAATGTGCG AT - #GGAAAAAC        4200                                                                           - CTGAAAGTGA ACTTTGATAA TGTCGGACTT GGTTACCTAT CTCTGCTTCA AG - #TTGCAACT        4260                                                                           - TTTAAGGGAT GGACGATTAT TATGTATGCA GCAGTGGATT CTGTTAATGT AG - #ACAAGCAG        4320                                                                           - CCCAAATATG AATATAGCCT CTACATGTAT ATTTATTTTG TCGTCTTTAT CA - #TCTTTGGG        4380                                                                           - TCATTCTTCA CTTTGAACTT GTTCATTGGT GTCATCATAG ATAATTTCAA CC - #AACAGAAA        4440                                                                           - AAGAAGCTTG GAGGTCAAGA CATCTTTATG ACAGAAGAAC AGAAGAAATA CT - #ATAATGCA        4500                                                                           - ATGAAAAAGC TGGGGTCCAA GAAGCCACAA AAGCCAATTC CTCGACCAGG GA - #ACAAAATC        4560                                                                           - CAAGGATGTA 
TATTTGACCT AGTGACAAAT CAAGCCTTTG ATATTAGTAT CA - #TGGTTCTT        4620                                                                           - ATCTGTCTCA ACATGGTAAC CATGATGGTA GAAAAGGAGG GTCAAAGTCA AC - #ATATGACT        4680                                                                           - GAAGTTTTAT ATTGGATAAA TGTGGTTTTT ATAATCCTTT TCACTGGAGA AT - #GTGTGCTA        4740                                                                           - AAACTGATCT CCCTCAGACA CTACTACTTC ACTGTAGGAT GGAATATTTT TG - #ATTTTGTG        4800                                                                           - GTTGTGATTA TCTCCATTGT AGGTATGTTT CTAGCTGATT TGATTGAAAC GT - #ATTTTGTG        4860                                                                           - TCCCCTACCC TGTTCCGAGT GATCCGTCTT GCCAGGATTG GCCGAATCCT AC - #GTCTAGTC        4920                                                                           - AAAGGAGCAA AGGGGATCCG CACGCTGCTC TTTGCTTTGA TGATGTCCCT TC - #CTGCGTTG        4980                                                                           - TTTAACATCG GCCTCCTGCT CTTCCTGGTC ATGTTCATCT ACGCCATCTT TG - #GAATGTCC        5040                                                                           - AACTTTGCCT ATGTTAAAAA GGAAGATGGA ATTAATGACA TGTTCAATTT TG - #AGACCTTT        5100                                                                           - GGCAACAGTA TGATTTGCCT GTTCCAAATT ACAACCTCTG CTGGCTGGGA TG - #GATTGCTA        5160                                                                           - GCACCTATTC TTAACAGTAA GCCACCCGAC TGTGACCCAA AAAAAGTTCA TC - #CTGGAAGT        5220                                                                           - TCAGTTGAAG GAGACTGTGG TAACCCATCT GTTGGAATAT TCTACTTTGT TA - #GTTATATC        5280                                                                           - ATCATATCCT TCCTGGTTGT GGTGAACATG TACATTGCAG TCATACTGGA GA - #ATTTTAGT        5340                                  
                                         - GTTGCCACTG AAGAAAGTAC TGAACCTCTG AGTGAGGATG ACTTTGAGAT GT - #TCTATGAG        5400                                                                           - GTTTGGGAGA AGTTTGATCC CGATGCGACC CAGTTTATAG AGTTCTCTAA AC - #TCTCTGAT        5460                                                                           - TTTGCAGCTG CCCTGGATCC TCCTCTTCTC ATAGCAAAAC CCAACAAAGT CC - #AGCTCATT        5520                                                                           - GCCATGGATC TGCCCATGGT TAGTGGTGAC CGGATCCATT GTCTTGACAT CT - #TATTTGCT        5580                                                                           - TTTACAAAGC GTGTTTTGGG TGAGAGTGGG GAGATGGATT CTCTTCGTTC AC - #AGATGGAA        5640                                                                           - GAAAGGTTCA TGTCTGCAAA TCCTTCCAAA GTGTCCTATG AACCCATCAC AA - #CCACACTA        5700                                                                           - AAACGGAAAC AAGAGGATGT GTCTGCTACT GTCATTCAGC GTGCTTATAG AC - #GTTACCGC        5760                                                                           - TTAAGGCAAA ATGTCAAAAA TATATCAAGT ATATACATAA AAGATGGAGA CA - #GAGATGAT        5820                                                                           - GATTTACTCA ATAAAAAAGA TATGGCTTTT GATAATGTTA ATGAGAACTC AA - #GTCCAGAA        5880                                                                           - AAAACAGATG CCACTTCATC CACCACCTCT CCACCTTCAT ATGATAGTGT AA - #CAAAGCCA        5940                                                                           - GACAAAGAGA AATATGAACA AGACAGAACA GAAAAGGAAG ACAAAGGGAA AG - #ACAGCAAG        6000                                                                           - GAAAGCAAAA AATAGAGCTT CATTTTTGAT ATATTGTTTA CAGCCTGTGA AA - #GTGATTTA        6060                                                                           - TTTGTGTTAA TAAAACTCTT TTGAGGAAGT CTATGCCAAA ATCCTTTTTA TC - 
#AAAATATT        6120
 - CTCGAAGGCA GTGCAGTCAC TAACTCTGAT TTCCTAAGAA AGGTGGGCAG CA - #TTAGCAGA        6180
 - TGGTTATTTT TGCACTGATG ATTCTTTAAG AATCGTAAGA GAACTCTGTA GG - #AATTATTG        6240
 - ATTATAGCAT ACAAAAGTGA TTGATTCAGT TTTTTGGTTT TTAATAAATC AG - #AAGACCAT        6300
 - GTAGAAAACT TTTACATCTG CCTTGTCATC TTTTCACAGG ATTGTAATTA GT - #CTTGTTTC        6360
 #                 640 - #4CATACAGA AAAAAAAAAA AAAA
 - (2) INFORMATION FOR SEQ ID NO:15:
 -      (i) SEQUENCE CHARACTERISTICS:
           (A) LENGTH: 1835 amino acids
           (B) TYPE: amino acid
           (C) STRANDEDNESS: Not Relevant
           (D) TOPOLOGY: Not Relevant
 -     (ii) MOLECULE TYPE: protein
 -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:
 - Met Ala Met Leu Pro Pro Pro Gly Pro Gln Se - #r Phe Val His Phe Thr          #                15
 - Lys Gln Ser Leu Ala Leu Ile Glu Gln Arg Il - #e Glu Lys Lys Glu Lys          #            30
 - Glu Lys Lys Asp Asp Glu Glu Pro Lys Pro Se - #r Ser Asp Leu Glu Ala          #        45
 - Gly Lys Gln Leu Pro Phe Ile Tyr Gly Asp Il - #e Pro Pro Gly Met Val          #    60 
                                                                       - Ser Glu Pro Leu Glu Asp Leu Asp Pro Tyr Ty - #r Ala Asp Lys Lys Thr          #80                                                                            - Phe Ile Val Leu Asn Lys Gly Lys Ile Phe Ar - #g Phe Asn Ala Thr Pro          #                95                                                            - Ala Leu Tyr Met Leu Ser Pro Phe Ser Pro Le - #u Arg Arg Ile Ser Ile          #           110                                                                - Lys Ile Leu Val His Ser Leu Phe Ser Met Le - #u Ile Met Cys Thr Ile          #       125                                                                    - Leu Thr Asn Cys Ile Phe Met Thr Asn Pro Pr - #o Trp Thr Lys Asn Val          #   140                                                                        - Tyr Thr Phe Thr Gly Ile Tyr Thr Phe Glu Se - #r Leu Lys Ile Leu Ala          145                 1 - #50                 1 - #55                 1 -        #60                                                                            - Arg Gly Phe Cys Val Gly Glu Phe Thr Phe Le - #u Arg Asp Pro Trp Asn          #               175                                                            - Trp Leu Asp Phe Val Val Ile Val Phe Ala Ty - #r Leu Thr Glu Phe Val          #           190                                                                - Asn Leu Gly Asn Val Ser Ala Leu Arg Thr Ph - #e Arg Val Leu Arg Ala          #       205                                                                    - Leu Lys Thr Ile Ser Val Ile Pro Gly Leu Ly - #s Thr Ile Val Gly Ala          #   220                                                                        - Leu Ile Gln Ser Val Lys Lys Leu Ser Asp Va - #l Met Ile Leu Thr Val          225                 2 - #30                 2 - #35                 2 -        #40                                                                            - Phe Cys Leu Ser Val Phe Ala 
Leu Ile Gly Le - #u Gln Leu Phe Met Gly          #               255                                                            - Asn Leu Lys His Lys Cys Phe Arg Leu Glu As - #n Glu Thr Leu Glu Ser          #           270                                                                - Ile Met Asn Thr Glu Ser Glu Glu Lys Tyr Ph - #e Tyr Tyr Leu Glu Gly          #       285                                                                    - Ser Lys Asp Ala Leu Leu Cys Gly Phe Ser Th - #r Asp Ser Gly Gln Cys          #   300                                                                        - Pro Glu Gly Tyr Cys Val Lys Gly Arg Asn Pr - #o Asp Tyr Gly Tyr Thr          305                 3 - #10                 3 - #15                 3 -        #20                                                                            - Ser Phe Asp Thr Phe Ser Trp Ala Phe Leu Al - #a Leu Phe Arg Leu Met          #               335                                                            - Thr Gln Asp Tyr Trp Glu Asn Leu Tyr Gln Gl - #n Thr Leu Arg Ala Ala          #           350                                                                - Gly Lys Thr Tyr Met Ile Phe Phe Val Val Va - #l Ile Phe Leu Gly Ser          #       365                                                                    - Phe Tyr Leu Ile Asn Leu Ile Leu Ala Val Va - #l Ala Met Ala Tyr Glu          #   380                                                                        - Glu Gln Asn Gln Ala Asn Ile Glu Glu Ala Ly - #s Gln Lys Glu Leu Glu          385                 3 - #90                 3 - #95                 4 -        #00                                                                            - Phe Gln Gln Met Leu Asp Arg Leu Lys Lys Gl - #u Gln Glu Glu Ala Glu          #               415                                                            - Ala Ile Ala Ala Ala Ala Ala Glu Thr Ser Il - #e Arg Ser Arg Ile Met          #           430                                        
                        - Gly Leu Ser Glu Ser Ser Ser Glu Thr Ser Le - #u Ser Ser Lys Ser Ala          #       445                                                                    - Lys Glu Arg Arg Asn Arg Arg Lys Lys Lys Gl - #n Lys Lys Ser Ser Gly          #   460                                                                        - Glu Glu Lys Gly Asp Glu Lys Leu Ser Lys Se - #r Ser Glu Ser Ile Arg          465                 4 - #70                 4 - #75                 4 -        #80                                                                            - Lys Ser Phe His Leu Gly Val Glu Gly His Ar - #g Glu Lys Arg Leu Ser          #               495                                                            - Thr Pro Asn Gln Ser Pro Leu Ser Ile Arg Gl - #y Ser Leu Phe Ser Ala          #           510                                                                - Arg Arg Ser Ser Arg Thr Ser Leu Phe Ser Ph - #e Lys Gly Arg Gly Arg          #       525                                                                    - Asp Gly Ser Glu Thr Glu Phe Ala Asp Asp Gl - #u His Ser Ile Phe Gly          #   540                                                                        - Asp Asn Glu Ser Arg Arg Gly Ser Leu Phe Va - #l Pro His Arg Pro Glu          545                 5 - #50                 5 - #55                 5 -        #60                                                                            - Arg Arg Ser Ser Asn Ile Ser Gln Ala Ser Ar - #g Ser Pro Pro Leu Pro          #               575                                                            - Val Asn Gly Lys Met His Ser Ala Val Asp Cy - #s Asn Gly Val Val Ser          #           590                                                                - Leu Val Asp Gly Ser Ala Leu Met Leu Pro As - #n Gly Gln Leu Leu Pro          #       605                                                                    - Glu Gly Thr Thr Asn Gln Lys Lys Arg Ser Se - #r Tyr Leu Ser Glu Asp          
#   620                                                                        - Met Leu Asn Asp Pro Leu Arg Gln Arg Ala Me - #t Ser Arg Ala Ser Ile          625                 6 - #30                 6 - #35                 6 -        #40                                                                            - Leu Thr Asn Thr Val Glu Glu Leu Glu Glu Se - #r Arg Gln Lys Cys Tyr          #               655                                                            - Arg Phe Ala His Phe Leu Ile Trp Asn Cys Se - #r Pro Tyr Trp Ile Lys          #           670                                                                - Phe Lys Lys Ile Tyr Phe Ile Val Met Asp Pr - #o Phe Val Asp Leu Ala          #       685                                                                    - Ile Thr Ile Cys Ile Val Leu Asn Thr Leu Ph - #e Met Ala Met Glu His          #   700                                                                        - His Pro Met Thr Glu Glu Phe Lys Asn Val Le - #u Ala Gly Asn Leu Phe          705                 7 - #10                 7 - #15                 7 -        #20                                                                            - Thr Gly Ile Phe Ala Ala Glu Met Val Leu Ly - #s Leu Ile Ala Met Asp          #               735                                                            - Pro Tyr Glu Tyr Phe Gln Val Gly Trp Asn Il - #e Phe Asp Ser Leu Ile          #           750                                                                - Val Thr Leu Ser Leu Glu Leu Phe Leu Ala As - #p Val Glu Gly Leu Ser          #       765                                                                    - Val Leu Arg Ser Phe Arg Leu Leu Arg Val Ph - #e Lys Leu Ala Lys Ser          #   780                                                                        - Trp Pro Thr Leu Asn Met Leu Ile Lys Ile Il - #e Gly Asn Ser Val Gly          785                 7 - #90                 7 - #95                 8 -        #00                      
                                                      - Ala Leu Gly Asn Leu Thr Leu Val Leu Ala Il - #e Ile Val Phe Ile Phe          #               815                                                            - Ala Val Val Gly Met Gln Leu Phe Gly Lys Se - #r Tyr Lys Glu Cys Val          #           830                                                                - Cys Lys Ile Asn Asp Cys Leu Pro Arg Trp Hi - #s Met Asn Asp Phe Phe          #       845                                                                    - His Ser Phe Leu Ile Val Phe Arg Val Leu Cy - #s Gly Glu Trp Ile Glu          #   860                                                                        - Thr Met Trp Asp Cys Met Glu Val Ala Gly Gl - #n Met Cys Leu Ile Val          865                 8 - #70                 8 - #75                 8 -        #80                                                                            - Tyr Met Met Val Met Val Ile Gly Asn Leu Va - #l Val Leu Asn Leu Phe          #               895                                                            - Leu Ala Leu Leu Leu Ser Ser Phe Ser Ser As - #p Asn Leu Thr Ala Ile          #           910                                                                - Glu Glu Asp Asp Ala Asn Asn Leu Gln Ile Al - #a Val Arg Ile Lys Gly          #       925                                                                    - Ile Asn Tyr Val Lys Gln Thr Leu Arg Glu Ph - #e Ile Leu Lys Phe Ser          #   940                                                                        - Lys Lys Pro Lys Ser Asp Asn Lys Lys Glu As - #n Tyr Ile Ser Asn Thr          945                 9 - #50                 9 - #55                 9 -        #60                                                                            - Leu Ala Glu Met Ser Lys His Asn Phe Leu Ly - #s Glu Lys Asp Ile Ser          #               975                                                            - Gly Gly Ser Ser Asp Lys Met Asp Gln Ser Ph - #e 
Ile His Asn Pro Ser          #           990                                                                - Leu Thr Val Thr Val Pro Ile Ala Pro Gly Gl - #u Ser Asp Leu Glu Met          #      10050                                                                   - Asn Glu Glu Leu Ser Ser Asp Ser Asp Ser Ty - #r Ser Lys Asn Arg Ser          #  10205                                                                       - Ser Ser Ser Glu Cys Ser Thr Val Asp Asn Pr - #o Leu Pro Gly Glu Gly          #               10401030 - #                1035                               - Glu Glu Ala Glu Ala Glu Pro Asn Asp Glu Pr - #o Glu Ala Cys Phe Thr          #              10550                                                           - Asp Gly Cys Val Arg Arg Phe Cys Cys Gln Va - #l Asn Ser Gly Lys Gly          #          10705                                                               - Lys Trp Trp Ile Arg Lys Thr Cys Tyr Ile Va - #l Glu His Ser Trp Phe          #      10850                                                                   - Glu Ser Phe Ile Val Leu Met Ile Leu Leu Se - #r Ser Gly Ala Leu Ala          #  11005                                                                       - Phe Glu Asp Ile Tyr Ile Glu Lys Lys Thr Il - #e Lys Ile Ile Leu Glu          #               11201110 - #                1115                               - Tyr Ala Asp Lys Ile Phe Thr Tyr Ile Phe Il - #e Leu Glu Met Leu Leu          #              11350                                                           - Lys Trp Ala Tyr Gly Tyr Lys Thr Tyr Phe Th - #r Asn Ala Trp Cys Trp          #          11505                                                               - Leu Asp Phe Leu Ile Val Asp Val Ser Leu Va - #l Thr Leu Val Ala Asn          #      11650                                                                   - Thr Leu Gly Tyr Ser Asp Leu Gly Pro Ile Ly - #s Ser Leu Arg Thr Leu          #  11805                                                                   
    - Arg Ala Leu Arg Pro Leu Arg Ala Leu Ser Ar - #g Phe Glu Gly Met Arg          #               12001190 - #                1195                               - Val Val Val Asn Ala Leu Ile Gly Ala Ile Pr - #o Ser Ile Met Asn Val          #              12150                                                           - Leu Leu Val Cys Leu Ile Phe Trp Leu Ile Ph - #e Ser Ile Met Gly Val          #          12305                                                               - Asn Leu Phe Ala Gly Lys Phe Tyr Glu Cys As - #n Thr Thr Asp Gly Ser          #      12450                                                                   - Arg Phe Pro Ser Gln Val Asn Arg Ser Glu Cy - #s Phe Ala Leu Met Asn          #  12605                                                                       - Val Ser Asn Val Arg Trp Lys Asn Leu Lys Va - #l Asn Phe Asp Asn Val          #               12801270 - #                1275                               - Gly Leu Gly Tyr Leu Ser Leu Leu Gln Val Al - #a Thr Phe Lys Gly Trp          #              12950                                                           - Ile Met Tyr Ala Ala Val Asp Ser Val Asn Va - #l Gln Pro Lys Tyr Glu          #          13105                                                               - Tyr Ser Leu Tyr Met Tyr Ile Tyr Phe Val Ph - #e Ile Ile Phe Gly Ser          #      13250                                                                   - Phe Phe Thr Leu Asn Leu Phe Ile Gly Val Il - #e Ile Asp Asn Phe Asn          #  13405                                                                       - Gln Gln Lys Lys Lys Leu Gly Gly Gln Asp Il - #e Phe Met Thr Glu Glu          #               13601350 - #                1355                               - Gln Lys Lys Tyr Tyr Asn Ala Met Lys Lys Le - #u Gly Ser Lys Lys Pro          #              13750                                                           - Gln Lys Pro Ile Pro Arg Pro Gly Asn Lys Gl - #n Gly Cys Ile Phe Asp          #          13905     
                                                          - Leu Thr Asn Gln Ala Phe Asp Ile Ile Met Va - #l Leu Ile Cys Leu Asn          #      14050                                                                   - Met Val Thr Met Met Val Glu Lys Glu Gly Gl - #n Met Val Leu Trp Ile          #  14205                                                                       - Asn Val Phe Ile Ile Leu Phe Thr Gly Glu Cy - #s Val Leu Lys Leu Ile          #               14401430 - #                1435                               - Ser Leu Arg His Tyr Tyr Phe Thr Val Gly Tr - #p Asn Ile Phe Val Val          #              14550                                                           - Val Ile Ser Ile Val Gly Met Phe Leu Ala Il - #e Glu Tyr Phe Val Ser          #          14705                                                               - Pro Thr Leu Phe Arg Val Ile Arg Leu Ala Ar - #g Ile Gly Arg Ile Leu          #      14850                                                                   - Arg Leu Lys Gly Ala Lys Gly Ile Arg Thr Le - #u Leu Phe Ala Leu Met          #  15005                                                                       - Met Ser Leu Pro Ala Leu Phe Asn Ile Gly Le - #u Leu Leu Phe Leu Val          #               15201510 - #                1515                               - Met Phe Ile Tyr Ala Ile Phe Gly Met Ser As - #n Phe Ala Tyr Val Lys          #              15350                                                           - Lys Glu Gly Ile Asn Asp Met Phe Asn Phe Gl - #u Thr Phe Gly Asn Ser          #          15505                                                               - Met Ile Cys Leu Phe Gln Ile Thr Thr Ser Al - #a Gly Trp Asp Gly Leu          #      15650                                                                   - Leu Ala Pro Ile Leu Asn Ser Pro Pro Asp Cy - #s Asp Pro Lys Lys Val          #  15805                                                                       - His Pro Gly Ser Ser Val Glu Gly Asp Cys Gl 
- #y Asn Pro Ser Val Gly          #               16001590 - #                1595                               - Ile Phe Tyr Phe Val Ser Tyr Ile Ile Ile Se - #r Phe Leu Val Val Val          #              16150                                                           - Asn Met Tyr Ile Ala Val Ile Leu Glu Asn Ph - #e Ser Val Ala Thr Glu          #          16305                                                               - Glu Ser Thr Glu Pro Leu Ser Glu Asp Asp Ph - #e Glu Met Phe Tyr Glu          #      16450                                                                   - Val Trp Glu Lys Phe Asp Pro Asp Ala Thr Gl - #n Phe Ile Glu Phe Lys          #  16605                                                                       - Leu Ser Asp Phe Ala Ala Ala Leu Asp Pro Pr - #o Leu Leu Ile Ala Lys          #               16801670 - #                1675                               - Pro Asn Lys Val Gln Leu Ile Ala Met Asp Le - #u Pro Met Val Ser Gly          #              16950                                                           - Asp Arg Ile His Cys Leu Asp Ile Leu Phe Al - #a Phe Thr Lys Arg Val          #          17105                                                               - Leu Gly Glu Gly Glu Met Asp Ser Leu Arg Se - #r Gln Met Glu Glu Arg          #      17250                                                                   - Phe Met Ser Ala Asn Pro Ser Lys Val Ser Ty - #r Glu Pro Ile Thr Thr          #  17405                                                                       - Thr Leu Lys Arg Lys Gln Glu Val Ser Ala Th - #r Ile Gln Arg Ala Tyr          #               17601750 - #                1755                               - Arg Arg Tyr Arg Leu Arg Gln Val Lys Asn Il - #e Ser Ser Ile Tyr Ile          #              17750                                                           - Lys Asp Gly Asp Arg Asp Asp Asp Leu Asn Ly - #s Asp Phe Asp Asn Val          #          17905                                                      
         - Asn Glu Asn Ser Ser Pro Glu Lys Thr Asp Th - #r Ser Thr Ser Pro Pro          #      18050                                                                   - Ser Tyr Asp Ser Val Thr Lys Pro Asp Glu Ly - #s Tyr Glu Asp Thr Glu          #  18205                                                                       - Lys Glu Asp Lys Lys Asp Ser Lys Glu Ser Ly - #s                              1825                1830 - #                1835                               - (2) INFORMATION FOR SEQ ID NO:16:                                            -      (i) SEQUENCE CHARACTERISTICS:                                           #acids    (A) LENGTH: 1969 amino                                                         (B) TYPE: amino acid                                                           (C) STRANDEDNESS: Not R - #elevant                                             (D) TOPOLOGY: linear                                                 -     (ii) MOLECULE TYPE: protein                                              -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:                                 - Met Ala Met Leu Pro Pro Pro Gly Pro Gln Se - #r Phe Val His Phe Thr          #                15                                                            - Lys Gln Ser Leu Ala Leu Ile Glu Gln Arg Il - #e Ala Glu Arg Lys Ser          #            30                                                                - Lys Glu Pro Lys Glu Glu Lys Lys Asp Asp As - #p Glu Glu Ala Pro Lys          #        45                                                                    - Pro Ser Ser Asp Leu Glu Ala Gly Lys Gln Le - #u Pro Phe Ile Tyr Gly          #    60                                                                        - Asp Ile Pro Pro Gly Met Val Ser Glu Pro Le - #u Glu Asp Leu Asp Pro          #80                                                                            - Tyr Tyr Ala Asp Lys Lys Thr Phe Ile Val Le - #u Asn Lys Gly Lys Ala          #               
 95                                                            - Ile Phe Arg Phe Asn Ala Thr Pro Ala Leu Ty - #r Met Leu Ser Pro Phe          #           110                                                                - Ser Pro Leu Arg Arg Ile Ser Ile Lys Ile Le - #u Val His Ser Leu Phe          #       125                                                                    - Ser Met Leu Ile Met Cys Thr Ile Leu Thr As - #n Cys Ile Phe Met Thr          #   140                                                                        - Met Asn Asn Pro Pro Asp Trp Thr Lys Asn Va - #l Gly Tyr Thr Phe Thr          145                 1 - #50                 1 - #55                 1 -        #60                                                                            - Gly Ile Tyr Thr Phe Glu Ser Leu Val Lys Il - #e Leu Ala Arg Gly Phe          #               175                                                            - Cys Val Gly Glu Phe Thr Phe Leu Arg Asp Pr - #o Trp Asn Trp Leu Asp          #           190                                                                - Phe Val Val Ile Val Phe Ala Tyr Leu Thr Gl - #u Phe Val Asn Leu Gly          #       205                                                                    - Asn Val Ser Ala Leu Arg Thr Phe Arg Val Le - #u Arg Ala Leu Lys Thr          #   220                                                                        - Ile Ser Val Ile Pro Gly Leu Lys Thr Ile Va - #l Gly Ala Leu Ile Gln          225                 2 - #30                 2 - #35                 2 -        #40                                                                            - Ser Val Lys Lys Leu Ser Asp Val Met Ile Le - #u Thr Val Phe Cys Leu          #               255                                                            - Ser Val Phe Ala Leu Ile Gly Leu Gln Leu Ph - #e Met Gly Asn Leu Lys          #           270                                                                - His Lys Cys Phe Arg Asn Ser Leu Glu 
Asn As - #n Glu Thr Leu Glu Ser          #       285                                                                    - Ile Met Asn Thr Leu Glu Ser Glu Glu Asp Ph - #e Arg Lys Tyr Phe Tyr          #   300                                                                        - Tyr Leu Glu Gly Ser Lys Asp Ala Leu Leu Cy - #s Gly Phe Ser Thr Asp          305                 3 - #10                 3 - #15                 3 -        #20                                                                            - Ser Gly Gln Cys Pro Glu Gly Tyr Thr Cys Va - #l Lys Ile Gly Arg Asn          #               335                                                            - Pro Asp Tyr Gly Tyr Thr Ser Phe Asp Thr Ph - #e Ser Trp Ala Phe Leu          #           350                                                                - Ala Leu Phe Arg Leu Met Thr Gln Asp Tyr Tr - #p Glu Asn Leu Tyr Gln          #       365                                                                    - Gln Thr Leu Arg Ala Ala Gly Lys Thr Tyr Me - #t Ile Phe Phe Val Val          #   380                                                                        - Val Ile Phe Leu Gly Ser Phe Tyr Leu Ile As - #n Leu Ile Leu Ala Val          385                 3 - #90                 3 - #95                 4 -        #00                                                                            - Val Ala Met Ala Tyr Glu Glu Gln Asn Gln Al - #a Asn Ile Glu Glu Ala          #               415                                                            - Lys Gln Lys Glu Leu Glu Phe Gln Gln Met Le - #u Asp Arg Leu Lys Lys          #           430                                                                - Glu Gln Glu Glu Ala Glu Ala Ile Ala Ala Al - #a Ala Ala Glu Tyr Thr          #       445                                                                    - Ser Ile Arg Arg Ser Arg Ile Met Gly Leu Se - #r Glu Ser Ser Ser Glu          #   460                                                        
                - Thr Ser Lys Leu Ser Ser Lys Ser Ala Lys Gl - #u Arg Arg Asn Arg Arg          465                 4 - #70                 4 - #75                 4 -        #80                                                                            - Lys Lys Lys Asn Gln Lys Lys Leu Ser Ser Gl - #y Glu Glu Lys Gly Asp          #               495                                                            - Ala Glu Lys Leu Ser Lys Ser Glu Ser Glu As - #p Ser Ile Arg Arg Lys          #           510                                                                - Ser Phe His Leu Gly Val Glu Gly His Arg Ar - #g Ala His Glu Lys Arg          #       525                                                                    - Leu Ser Thr Pro Asn Gln Ser Pro Leu Ser Il - #e Arg Gly Ser Leu Phe          #   540                                                                        - Ser Ala Arg Arg Ser Ser Arg Thr Ser Leu Ph - #e Ser Phe Lys Gly Arg          545                 5 - #50                 5 - #55                 5 -        #60                                                                            - Gly Arg Asp Xaa Gly Ser Glu Thr Glu Phe Al - #a Asp Asp Glu His Ser          #               575                                                            - Ile Phe Gly Asp Asn Glu Ser Arg Arg Gly Se - #r Leu Phe Val Pro His          #           590                                                                - Arg Pro Xaa Glu Arg Arg Ser Ser Asn Ile Se - #r Gln Ala Ser Arg Ser          #       605                                                                    - Pro Pro Met Leu Pro Val Asn Gly Lys Met Hi - #s Ser Ala Val Asp Cys          #   620                                                                        - Asn Gly Val Val Ser Leu Val Asp Gly Xaa Se - #r Ala Leu Met Leu Pro          625                 6 - #30                 6 - #35                 6 -        #40                                                                            - Asn 
Gly Gln Leu Leu Pro Glu Gly Thr Thr As - #n Gln Ile His Lys Lys          #               655                                                            - Arg Arg Cys Ser Ser Tyr Leu Leu Ser Glu As - #p Met Leu Asn Asp Pro          #           670                                                                - Asn Leu Arg Gln Arg Ala Met Ser Arg Ala Se - #r Ile Leu Thr Asn Thr          #       685                                                                    - Val Glu Glu Leu Glu Glu Ser Arg Gln Lys Cy - #s Pro Pro Trp Trp Tyr          #   700                                                                        - Arg Phe Ala His Lys Phe Leu Ile Trp Asn Cy - #s Ser Pro Tyr Trp Ile          705                 7 - #10                 7 - #15                 7 -        #20                                                                            - Lys Phe Lys Lys Cys Ile Tyr Phe Ile Val Me - #t Asp Pro Phe Val Asp          #               735                                                            - Leu Ala Ile Thr Ile Cys Ile Val Leu Asn Th - #r Leu Phe Met Ala Met          #           750                                                                - Glu His His Pro Met Thr Glu Glu Phe Lys As - #n Val Leu Ala Ile Gly          #       765                                                                    - Asn Leu Val Phe Thr Gly Ile Phe Ala Ala Gl - #u Met Val Leu Lys Leu          #   780                                                                        - Ile Ala Met Asp Pro Tyr Glu Tyr Phe Gln Va - #l Gly Trp Asn Ile Phe          785                 7 - #90                 7 - #95                 8 -        #00                                                                            - Asp Ser Leu Ile Val Thr Leu Ser Leu Val Gl - #u Leu Phe Leu Ala Asp          #               815                                                            - Val Glu Gly Leu Ser Val Leu Arg Ser Phe Ar - #g Leu Leu Arg Val Phe          #           830                
                                                - Lys Leu Ala Lys Ser Trp Pro Thr Leu Asn Me - #t Leu Ile Lys Ile Ile          #       845                                                                    - Gly Asn Ser Val Gly Ala Leu Gly Asn Leu Th - #r Leu Val Leu Ala Ile          #   860                                                                        - Ile Val Phe Ile Phe Ala Val Val Gly Met Gl - #n Leu Phe Gly Lys Ser          865                 8 - #70                 8 - #75                 8 -        #80                                                                            - Tyr Lys Glu Cys Val Cys Lys Ile Asn Asp As - #p Cys Thr Leu Pro Arg          #               895                                                            - Trp His Met Asn Asp Phe Phe His Ser Phe Le - #u Ile Val Phe Arg Val          #           910                                                                - Leu Cys Gly Glu Trp Ile Glu Thr Met Trp As - #p Cys Met Glu Val Ala          #       925                                                                    - Gly Gln Ala Met Cys Leu Ile Val Tyr Met Me - #t Val Met Val Ile Gly          #   940                                                                        - Asn Leu Val Val Leu Asn Leu Phe Leu Ala Le - #u Leu Leu Ser Ser Phe          945                 9 - #50                 9 - #55                 9 -        #60                                                                            - Ser Ser Asp Asn Leu Thr Ala Ile Glu Glu As - #p Pro Asp Ala Asn Asn          #               975                                                            - Leu Gln Ile Ala Val Thr Arg Ile Lys Lys Gl - #y Ile Asn Tyr Val Lys          #           990                                                                - Gln Thr Leu Arg Glu Phe Ile Leu Lys Ala Ph - #e Ser Lys Lys Pro Lys          #      10050                                                                   - Ile Ser Arg Glu Ile Arg Gln Ala Glu Asp Le - #u Asn 
Thr Lys Lys Glu          #  10205                                                                       - Asn Tyr Ile Ser Asn Met Thr Leu Ala Glu Me - #t Ser Lys Gly His Asn          #               10401030 - #                1035                               - Phe Leu Lys Glu Lys Asp Lys Ile Ser Gly Ph - #e Gly Ser Ser Xaa Asp          #              10550                                                           - Lys His Leu Met Glu Asp Ser Asp Gly Gln Se - #r Phe Ile His Asn Pro          #          10705                                                               - Ser Leu Thr Val Thr Val Pro Ile Ala Pro Gl - #y Glu Ser Asp Leu Glu          #      10850                                                                   - Met Asn Glu Glu Leu Ser Ser Asp Ser Asp Se - #r Tyr Ser Lys Asn Arg          #  11005                                                                       - Ser Ser Ser Ser Glu Cys Ser Thr Val Asp As - #n Pro Leu Pro Gly Glu          #               11201110 - #                1115                               - Gly Glu Glu Ala Glu Ala Glu Pro Asn Asp Gl - #u Pro Glu Ala Cys Phe          #              11350                                                           - Thr Asp Gly Cys Val Arg Arg Phe Ser Cys Cy - #s Gln Val Asn Ile Glu          #          11505                                                               - Ser Gly Lys Gly Lys Ile Trp Trp Asn Ile Ar - #g Lys Thr Cys Tyr Lys          #      11650                                                                   - Ile Val Glu His Ser Trp Phe Glu Ser Phe Il - #e Val Leu Met Ile Leu          #  11805                                                                       - Leu Ser Ser Gly Ala Leu Ala Phe Glu Asp Il - #e Tyr Ile Glu Arg Lys          #               12001190 - #                1195                               - Lys Thr Ile Lys Ile Ile Leu Glu Tyr Ala As - #p Lys Ile Phe Thr Tyr          #              12150                                                           
- Ile Phe Ile Leu Glu Met Leu Leu Lys Trp Il - #e Ala Tyr Gly Tyr Lys          #          12305                                                               - Thr Tyr Phe Thr Asn Ala Trp Cys Trp Leu As - #p Phe Leu Ile Val Asp          #      12450                                                                   - Val Ser Leu Val Thr Leu Val Ala Asn Thr Le - #u Gly Tyr Ser Asp Leu          #  12605                                                                       - Gly Pro Ile Lys Ser Leu Arg Thr Leu Arg Al - #a Leu Arg Pro Leu Arg          #               12801270 - #                1275                               - Ala Leu Ser Arg Phe Glu Gly Met Arg Val Va - #l Val Asn Ala Leu Ile          #              12950                                                           - Gly Ala Ile Pro Ser Ile Met Asn Val Leu Le - #u Val Cys Leu Ile Phe          #          13105                                                               - Trp Leu Ile Phe Ser Ile Met Gly Val Asn Le - #u Phe Ala Gly Lys Phe          #      13250                                                                   - Tyr Glu Cys Ile Asn Thr Thr Asp Gly Ser Ar - #g Phe Pro Ala Ser Gln          #  13405                                                                       - Val Pro Asn Arg Ser Glu Cys Phe Ala Leu Me - #t Asn Val Ser Gln Asn          #               13601350 - #                1355                               - Val Arg Trp Lys Asn Leu Lys Val Asn Phe As - #p Asn Val Gly Leu Gly          #              13750                                                           - Tyr Leu Ser Leu Leu Gln Val Ala Thr Phe Ly - #s Gly Trp Thr Ile Ile          #          13905                                                               - Met Tyr Ala Ala Val Asp Ser Val Asn Val As - #p Lys Gln Pro Lys Tyr          #      14050                                                                   - Glu Tyr Ser Leu Tyr Met Tyr Ile Tyr Phe Va - #l Val Phe Ile Ile Phe          #  14205                 
                                                      - Gly Ser Phe Phe Thr Leu Asn Leu Phe Ile Gl - #y Val Ile Ile Asp Asn          #               14401430 - #                1435                               - Phe Asn Gln Gln Lys Lys Lys Leu Gly Gly Gl - #n Asp Ile Phe Met Thr          #              14550                                                           - Glu Glu Gln Lys Lys Tyr Tyr Asn Ala Met Ly - #s Lys Leu Gly Ser Lys          #          14705                                                               - Lys Pro Gln Lys Pro Ile Pro Arg Pro Gly As - #n Lys Ile Gln Gly Cys          #      14850                                                                   - Ile Phe Asp Leu Val Thr Asn Gln Ala Phe As - #p Ile Ser Ile Met Val          #  15005                                                                       - Leu Ile Cys Leu Asn Met Val Thr Met Met Va - #l Glu Lys Glu Gly Gln          #               15201510 - #                1515                               - Ser Gln His Met Thr Glu Val Leu Tyr Trp Il - #e Asn Val Val Phe Ile          #              15350                                                           - Ile Leu Phe Thr Gly Glu Cys Val Leu Lys Le - #u Ile Ser Leu Arg His          #          15505                                                               - Tyr Tyr Phe Thr Val Gly Trp Asn Ile Phe As - #p Phe Val Val Val Ile          #      15650                                                                   - Ile Ser Ile Val Gly Met Phe Leu Ala Asp Le - #u Ile Glu Thr Tyr Phe          #  15805                                                                       - Val Ser Pro Thr Leu Phe Arg Val Ile Arg Le - #u Ala Arg Ile Gly Arg          #               16001590 - #                1595                               - Ile Leu Arg Leu Val Lys Gly Ala Lys Gly Il - #e Arg Thr Leu Leu Phe          #              16150                                                           - Ala Leu Met Met Ser Leu Pro Ala Leu Phe As - #n 
Ile Gly Leu Leu Leu          #          16305                                                               - Phe Leu Val Met Phe Ile Tyr Ala Ile Phe Gl - #y Met Ser Asn Phe Ala          #      16450                                                                   - Tyr Val Lys Lys Glu Asp Gly Ile Asn Asp Me - #t Phe Asn Phe Glu Thr          #  16605                                                                       - Phe Gly Asn Ser Met Ile Cys Leu Phe Gln Il - #e Thr Thr Ser Ala Gly          #               16801670 - #                1675                               - Trp Asp Gly Leu Leu Ala Pro Ile Leu Asn Se - #r Lys Pro Pro Asp Cys          #              16950                                                           - Asp Pro Lys Lys Val His Pro Gly Ser Ser Va - #l Glu Gly Asp Cys Gly          #          17105                                                               - Asn Pro Ser Val Gly Ile Phe Tyr Phe Val Se - #r Tyr Ile Ile Ile Ser          #      17250                                                                   - Phe Leu Val Val Val Asn Met Tyr Ile Ala Va - #l Ile Leu Glu Asn Phe          #  17405                                                                       - Ser Val Ala Thr Glu Glu Ser Thr Glu Pro Le - #u Ser Glu Asp Asp Phe          #               17601750 - #                1755                               - Glu Met Phe Tyr Glu Val Trp Glu Lys Phe As - #p Pro Asp Ala Thr Gln          #              17750                                                           - Phe Ile Glu Phe Ser Lys Leu Ser Asp Phe Al - #a Ala Ala Leu Asp Pro          #          17905                                                               - Pro Leu Leu Ile Ala Lys Pro Asn Lys Val Gl - #n Leu Ile Ala Met Asp          #      18050                                                                   - Leu Pro Met Val Ser Gly Asp Arg Ile His Cy - #s Leu Asp Ile Leu Phe          #  18205                                                                   
    - Ala Phe Thr Lys Arg Val Leu Gly Glu Ser Gl - #y Glu Met Asp Ser Leu          #               18401830 - #                1835                               - Arg Ser Gln Met Glu Glu Arg Phe Met Ser Al - #a Asn Pro Ser Lys Val          #              18550                                                           - Ser Tyr Glu Pro Ile Thr Thr Thr Leu Lys Ar - #g Lys Gln Glu Xaa Val          #          18705                                                               - Ser Ala Thr Val Ile Gln Arg Ala Tyr Arg Ar - #g Tyr Arg Leu Arg Gln          #      18850                                                                   - Asn Val Lys Asn Ile Ser Ser Ile Tyr Ile Ly - #s Asp Gly Asp Arg Asp          #  19005                                                                       - Asp Asp Leu Leu Asn Lys Glu Asp Met Ala Ph - #e Asp Asn Val Asn Glu          #               19201910 - #                1915                               - Asn Ser Ser Pro Glu Lys Thr Asp Ala Thr Se - #r Ser Thr Thr Ser Pro          #              19350                                                           - Pro Ser Tyr Asp Ser Val Thr Lys Pro Asp Ly - #s Glu Lys Tyr Glu Xaa          #          19505                                                               - Asp Gln Thr Glu Lys Glu Asp Lys Gly Lys As - #p Ser Lys Glu Ser Lys          #      19650                                                                   - Lys                                                                          - (2) INFORMATION FOR SEQ ID NO:17:                                            -      (i) SEQUENCE CHARACTERISTICS:                                           #pairs    (A) LENGTH: 21 base                                                            (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: single                                                       (D) TOPOLOGY: linear                                                 -     (ii) MOLECULE 
TYPE: cDNA
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
CCCA G    21
(2) INFORMATION FOR SEQ ID NO:18:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 26 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: cDNA
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
TGGA ATTGCT    26
(2) INFORMATION FOR SEQ ID NO:19:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 23 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: cDNA
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:
GCAA TGA    23

What is claimed is:
 1. An isolated nucleic acid molecule comprising: (a) a polynucleotide sequence encoding the polypeptide of SEQ ID NO:10; or (b) a polynucleotide sequence that is complementary to a polynucleotide sequence set forth in (a).
 2. A method of making a recombinant vector comprising inserting the nucleic acid molecule of claim 1 into a vector.
 3. A recombinant vector comprising the nucleic acid of claim 1.
 4. A method of making a cultured recombinant host cell comprising introducing the recombinant vector of claim 3 into a cultured host cell.
 5. A cultured host cell comprising the vector of claim 3.
 6. A recombinant virion comprising the isolated nucleic acid molecule of claim 1.
 7. An isolated nucleic acid molecule comprising: (a) a polynucleotide sequence encoding the polypeptide of SEQ ID NO:2; or (b) a polynucleotide sequence that is complementary to a polynucleotide sequence set forth in (a).
 8. A method of making a recombinant vector comprising inserting the nucleic acid molecule of claim 7 into a vector.
 9. A recombinant vector comprising the nucleic acid of claim 7.
 10. A method of making a cultured host cell comprising introducing the recombinant vector of claim 9 into a cultured host cell.
 11. A cultured host cell comprising the vector of claim 9.
 12. A recombinant virion comprising the isolated nucleic acid molecule of claim 7.
 13. An isolated nucleic acid molecule which encodes a polypeptide comprising amino acids selected from the group consisting of: (a) amino acids 18-214 of SEQ ID NO:2; (b) amino acids 229-258 of SEQ ID NO:2; (c) amino acids 268-297 of SEQ ID NO:2; (d) amino acids 300-325 of SEQ ID NO:2; (e) amino acids 326-351 of SEQ ID NO:2; (f) amino acids 474-504 of SEQ ID NO:2; (g) amino acids 501-554 of SEQ ID NO:2; (h) amino acids 553-583 of SEQ ID NO:2; (i) amino acids 589-615 of SEQ ID NO:2; (j) amino acids 619-646 of SEQ ID NO:2; (k) amino acids 233-555 of SEQ ID NO:2; (l) amino acids 554-945 of SEQ ID NO:2.
 14. The nucleic acid molecule of claim 13, wherein said molecule is a detectably labeled probe.
 15. A method of detecting PNS SCP encoding nucleic acid in a sample comprising: (a) contacting said sample with the labeled probe of claim 14; and (b) detecting the presence of said labeled probe bound to PNS SCP nucleic acid.